Master's Thesis - Unraveling topology of Dynamic Neural Networks
Many reasonably simple dynamical systems can be described directly by differential equations derived from fundamental physical laws. More intricate and complex systems, however, pose challenges for this approach. In such cases, data-driven modelling offers an alternative: it approximates the system's behaviour by learning from real-world observations, i.e. sampled input-output data. Data-driven models also enable faster, computationally cheaper simulations that can be used for parameter estimation and uncertainty quantification.
Peter Meijer developed NEUREKA, an application that simulates electrical circuits through reduced-order models built from dynamic neural networks (DNNs). He proposes a methodology for constructing efficient models of nonlinear, multi-dimensional dynamical systems from input-output data. However, because the topology of the DNN is not known beforehand, the architecture search is computationally very expensive. Moreover, the parameter optimisation is prone to getting stuck in local minima.
Prof. Wil Schilders' work tackles these drawbacks for linear systems. It first applies the MOESP subspace identification algorithm to obtain a reduced-order Linear Time-Invariant (LTI) state-space model from input-output data. Starting from this state-space model, it establishes a mapping from the state-space matrices to the DNN architecture, i.e. the number of hidden layers and the number of neurons, thereby eliminating the need for an architecture search. Based on this mapping, it also proposes a strategy for initialising the weights of the DNN to ensure a good initial guess. However, the Bartels-Stewart algorithm used in this mapping requires every eigenvalue of the state matrix to have algebraic multiplicity equal to one.
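For context, the Bartels-Stewart algorithm is the standard direct solver for Sylvester equations of the form AX + XB = Q, and it is what SciPy's `solve_sylvester` wraps. A minimal sketch (the matrices are illustrative only, not taken from the thesis; note that the state matrix here has distinct eigenvalues, consistent with the multiplicity-one requirement above):

```python
import numpy as np
from scipy.linalg import solve_sylvester

# Illustrative matrices only. A has distinct eigenvalues (-1 and -3),
# i.e. each eigenvalue has algebraic multiplicity one.
A = np.array([[-1.0,  0.5],
              [ 0.0, -3.0]])
B = np.array([[-2.0,  0.0],
              [ 1.0, -4.0]])
Q = np.eye(2)

# solve_sylvester uses the Bartels-Stewart algorithm to solve A X + X B = Q.
X = solve_sylvester(A, B, Q)
print(np.allclose(A @ X + X @ B, Q))  # True
```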
In our work, we generalise the ideas developed in this earlier work and propose a way to relax the algebraic multiplicity constraint. Given any LTI system describing the underlying dynamics (either identified from input-output data with a subspace algorithm such as MOESP, or taken from a known mathematical model), we determine the number of layers in the DNN and the number of neurons in each layer. We also determine how these neurons are connected to each other, based on the structure of the state-space matrices. Lastly, we formulate a strategy to initialise the weights of the DNN using domain knowledge in the form of the reduced state-space model. The thesis also implements and studies training of these dynamic neural networks with reverse-mode differentiation and tests the implementation on several numerical examples.
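To make the mapping idea concrete, here is a hypothetical sketch of how a reduced-order LTI model could fix the size of a network and seed its initial weights. The function name and the one-neuron-per-state layout are assumptions for illustration; the thesis derives the actual mapping from the structure of the state-space matrices.

```python
import numpy as np

def init_dnn_from_state_space(A, B, C, D):
    """Hypothetical illustration: size and initialise a network from a
    reduced-order LTI model
        x' = A x + B u,    y = C x + D u.
    One hidden neuron per state variable, with weights seeded by the
    state-space matrices so that the untrained (linearly activated)
    network already reproduces the LTI dynamics."""
    n = A.shape[0]  # number of hidden neurons = order of the reduced model
    return {
        "hidden_size": n,
        "W_state": A.copy(),        # state-to-state (recurrent) weights
        "W_input": B.copy(),        # input-to-state weights
        "W_output": C.copy(),       # state-to-output weights
        "W_feedthrough": D.copy(),  # direct input-to-output weights
    }

# Illustrative second-order model (values are made up).
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[1.0],
              [0.5]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

params = init_dnn_from_state_space(A, B, C, D)
print(params["hidden_size"])  # 2: one hidden neuron per state variable
```

Starting from such an initial guess, the weights can then be refined by gradient-based training, with the gradients obtained via reverse-mode differentiation.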
Find a copy of my master's thesis here.