In Part 1 of the project, a linear regression approximation of a nonlinear static system was performed to estimate the parameters of the system from its input and output data set.
The identification and validation data sets were loaded and plotted in MATLAB in order to examine the nonlinear static system for outliers. Furthermore, using the assigned flag number "05", the function provided in Part 1 of the project splits the I/O data into the identification and validation data sets.
Figure: output plotted against the input for the assigned noisy data set.
To apply the cross-validation approach, the quality of the identified models must be compared by validating them on a data set that was not used for estimation. Therefore, the I/O data is divided into two subsets, the identification data and the validation data. The identification data is used to compute the least-squares estimate of the optimal parameters $\theta$, while the validation data is used to verify how the estimated model behaves with new data.
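The identification/validation workflow described above can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB used for the project. The random split in `split_io_data`, the straight-line model, and the synthetic data are all assumptions made for the example; the project's own split is performed by the provided function and the assigned flag, whose internals are not reproduced here.

```python
import random


def split_io_data(u, y, validation_fraction=0.5, seed=0):
    """Randomly split paired I/O samples into identification and validation sets.

    Hypothetical helper: the project uses a provided function for this step.
    """
    indices = list(range(len(u)))
    random.Random(seed).shuffle(indices)
    n_val = int(len(indices) * validation_fraction)
    val_idx = sorted(indices[:n_val])
    id_idx = sorted(indices[n_val:])
    ident = ([u[i] for i in id_idx], [y[i] for i in id_idx])
    valid = ([u[i] for i in val_idx], [y[i] for i in val_idx])
    return ident, valid


def fit_line(u, y):
    """Least-squares fit of y = a*u + b (closed form for two parameters)."""
    n = len(u)
    mean_u = sum(u) / n
    mean_y = sum(y) / n
    s_uu = sum((ui - mean_u) ** 2 for ui in u)
    s_uy = sum((ui - mean_u) * (yi - mean_y) for ui, yi in zip(u, y))
    a = s_uy / s_uu
    b = mean_y - a * mean_u
    return a, b


def mean_squared_error(u, y, a, b):
    """Mean squared difference between observed and predicted outputs."""
    return sum((yi - (a * ui + b)) ** 2 for ui, yi in zip(u, y)) / len(u)


# Synthetic noisy data (an assumption, not the project data set).
rng = random.Random(1)
u = [0.1 * k for k in range(40)]
y = [2.0 * ui + 1.0 + rng.gauss(0.0, 0.1) for ui in u]

# Estimate the parameters on the identification data only ...
(u_id, y_id), (u_val, y_val) = split_io_data(u, y)
a, b = fit_line(u_id, y_id)

# ... and judge the model on data it has never seen.
validation_mse = mean_squared_error(u_val, y_val, a, b)
print(f"a = {a:.3f}, b = {b:.3f}, validation MSE = {validation_mse:.4f}")
```

The key design point is that `fit_line` never sees the validation samples, so the validation error is an honest measure of how the model behaves with new data.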
The advantages of this approach are that the comparison is realistic and requires neither probabilistic arguments nor assumptions about the true system. The disadvantages are that fresh data must be set aside for validation, so not all of the data can be used for identification. Moreover, increasing the model order decreases the simulation error on the identification data, which leads to over-fitting; a low error on the identification data is therefore not sufficient to validate a model. The cross-validation method was used because evaluating each model on data that was not used for estimation exposes over-fitting rather than rewarding a higher model order.
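The over-fitting argument can be made concrete by fitting models of increasing order and comparing the two errors. The sketch below is written in Python rather than MATLAB and rests on several assumptions: the polynomial model structure, the synthetic system `sin(3u)` with added noise, the interleaved split, and the helper names are all hypothetical and chosen only for illustration. Because each lower-order polynomial is a special case of the next one, the identification error cannot increase with the order, whereas the validation error need not keep decreasing.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_polynomial(u, y, order):
    """Least-squares polynomial fit: solve (Phi^T Phi) theta = Phi^T Y."""
    phi = [[ui ** k for k in range(order + 1)] for ui in u]
    size = order + 1
    gram = [[sum(row[i] * row[j] for row in phi) for j in range(size)]
            for i in range(size)]
    rhs = [sum(row[i] * yi for row, yi in zip(phi, y)) for i in range(size)]
    return solve_linear_system(gram, rhs)


def mean_squared_error(u, y, theta):
    """Mean squared difference between observed and predicted outputs."""
    total = 0.0
    for ui, yi in zip(u, y):
        prediction = sum(t * ui ** k for k, t in enumerate(theta))
        total += (yi - prediction) ** 2
    return total / len(u)


# Synthetic noisy nonlinear static system (an assumption, not the project data).
rng = random.Random(5)
u_all = [-1.0 + 0.05 * k for k in range(41)]
y_all = [math.sin(3.0 * ui) + rng.gauss(0.0, 0.1) for ui in u_all]

# Interleaved split: even samples for identification, odd samples for validation.
u_id, y_id = u_all[0::2], y_all[0::2]
u_val, y_val = u_all[1::2], y_all[1::2]

results = []
for order in range(1, 6):
    theta = fit_polynomial(u_id, y_id, order)
    results.append((order,
                    mean_squared_error(u_id, y_id, theta),
                    mean_squared_error(u_val, y_val, theta)))

for order, id_mse, val_mse in results:
    print(f"order {order}: identification MSE {id_mse:.4f}, "
          f"validation MSE {val_mse:.4f}")
```

Selecting the order with the smallest validation error, rather than the smallest identification error, is one common way to apply the cross-validation idea in practice.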
In this subsection on linear regression modeling, the normal equation and the estimate of the unknown parameters $\theta$ were derived using the least-squares estimation method. Using the prediction error approach, parameter estimation can be viewed as a least-squares problem. Using the collected I/O data, we can predict the next output; therefore, the regression form can be obtained as seen below:
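Linear Regression Model: $Y = \Phi\theta$ (1)
Least-Squares Estimate: $\hat{\theta} = (\Phi^{T}\Phi)^{-1}\Phi^{T}Y$ (2)
Least-Squares Estimation Error: $\varepsilon = Y - \Phi\hat{\theta}$ (3)
Where,
$Y$: vector of output observations
$\Phi$: matrix of regressors formed from the input data
$\theta$: vector of unknown parameters
$\hat{\theta}$: least-squares estimate of the unknown parameters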
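For reference, the step from the regression model to the least-squares estimate can be written out as a short derivation. The notation here is assumed: $Y$ is the vector of output observations, $\Phi$ the matrix of regressors, $\theta$ the vector of unknown parameters, and $J(\theta)$ a cost function introduced only for this sketch. The last line holds provided $\Phi^{T}\Phi$ is invertible.

```latex
\begin{align*}
J(\theta) &= (Y - \Phi\theta)^{T}(Y - \Phi\theta)
  && \text{sum of squared errors} \\
\frac{\partial J}{\partial \theta} &= -2\,\Phi^{T}(Y - \Phi\theta) = 0
  && \text{stationarity condition} \\
\Phi^{T}\Phi\,\hat{\theta} &= \Phi^{T}Y
  && \text{normal equation} \\
\hat{\theta} &= (\Phi^{T}\Phi)^{-1}\Phi^{T}Y
  && \text{least-squares estimate}
\end{align*}
```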
Overall, equation (1) is a set of overdetermined linear equations: there are more data points than unknown parameters, so in general no exact solution exists. Therefore, to solve these equations, the least-squares estimation approach was applied. The least-squares estimation error in equation (3) is defined as the difference between the observed output data points and the output predicted by the regression model.
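A small numerical example may help to show how an overdetermined system is solved through the normal equation. The sketch below is in Python rather than MATLAB, and everything specific in it is an assumption: the two-parameter model `y = theta_0 + theta_1 * u**2`, the six made-up observations, and the variable names. With only two unknowns, the 2x2 normal equation can be solved in closed form.

```python
# Hypothetical model, linear in the parameters: y = theta_0 + theta_1 * u**2
u = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [4.9, 2.1, 1.0, 1.9, 5.2, 9.8]  # made-up noisy observations

# Regressor matrix Phi: one row [1, u^2] per observation.
# Six equations, two unknowns, so the system is overdetermined.
phi = [[1.0, ui ** 2] for ui in u]

# Build the normal equation (Phi^T Phi) theta = Phi^T Y.
a11 = sum(row[0] * row[0] for row in phi)
a12 = sum(row[0] * row[1] for row in phi)
a22 = sum(row[1] * row[1] for row in phi)
b1 = sum(row[0] * yi for row, yi in zip(phi, y))
b2 = sum(row[1] * yi for row, yi in zip(phi, y))

# Solve the 2x2 system with the explicit inverse.
det = a11 * a22 - a12 * a12
theta = [(a22 * b1 - a12 * b2) / det,
         (a11 * b2 - a12 * b1) / det]

# Estimation error: observed output minus the output predicted by the model.
y_hat = [row[0] * theta[0] + row[1] * theta[1] for row in phi]
residual = [yi - yh for yi, yh in zip(y, y_hat)]

# At the least-squares solution the residual is orthogonal to every regressor.
phi_t_residual = [sum(row[k] * e for row, e in zip(phi, residual))
                  for k in range(2)]

print("theta =", theta)
print("Phi^T * residual =", phi_t_residual)
```

The final check, that `phi_t_residual` is zero up to rounding, is simply the normal equation restated in terms of the estimation error.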