NARMAX (Nonlinear AutoRegressive Moving Average with eXogenous inputs) models describe nonlinear systems in terms of linear-in-the-parameters difference equations, which express the current output in terms of present and past inputs and past outputs. Identifying a NARMAX model requires two things: (1) structure detection and (2) parameter estimation. Structure detection can be divided into (1a) model order selection and (1b) selection of the terms to include in the model. We consider model order selection part of structure detection since, theoretically, an infinite number of candidate terms could be considered initially. Establishing the model order therefore limits the choice of terms to be considered. Good parameter estimation methods exist if the model order is known. However, model order selection remains a problem. Depending on the order of the system, the number of candidate terms can be very large, and selecting a subset of these candidate terms is necessary for an efficient system description. In fact, many NARMAX systems are described by only a few terms. Structure detection remains an unresolved issue in system identification for over-parameterized models.
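To give a sense of how quickly the candidate set grows, the following minimal sketch enumerates the monomials of a polynomial NARMAX expansion. The polynomial form, the helper name `candidate_terms`, and the lag conventions (inputs from lag 0, outputs and errors from lag 1) are assumptions made for illustration only; they are not taken from the work described here.

```python
from itertools import combinations_with_replacement
from math import comb


def candidate_terms(n_u, n_y, n_e, degree):
    """Enumerate candidate monomials of a polynomial NARMAX expansion.

    Hypothetical helper: n_u, n_y, n_e are the maximum input, output,
    and error lags; degree is the maximum polynomial degree.
    """
    # Lagged regressors: u(k)..u(k-n_u), y(k-1)..y(k-n_y), e(k-1)..e(k-n_e)
    variables = [f"u(k-{i})" for i in range(0, n_u + 1)]
    variables += [f"y(k-{i})" for i in range(1, n_y + 1)]
    variables += [f"e(k-{i})" for i in range(1, n_e + 1)]

    terms = ["1"]  # constant term
    for d in range(1, degree + 1):
        # Every product of d regressors, repeats allowed, order ignored
        for combo in combinations_with_replacement(variables, d):
            terms.append("*".join(combo))
    return terms


# Second-order dynamics with a cubic nonlinearity
terms = candidate_terms(n_u=2, n_y=2, n_e=2, degree=3)
n_vars = (2 + 1) + 2 + 2  # number of lagged regressors
print(len(terms), comb(n_vars + 3, 3))
```

Even this modest configuration produces well over a hundred candidate terms, whereas the true system may need only a handful of them.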
For model order estimation, a model having the correct order will minimize the expected value of the prediction errors (Shao 1996). However, the correct model order is not always evident, and statistical errors may lead to inconsistent or inaccurate estimates of model order for particular realizations. One approach is to acquire extensive data sets so that the expected prediction error is estimated more accurately. An alternative is to find some way of improving the estimate of the expected prediction error from limited data.
Two methods are widely used in regression analysis for structure computation; viz., the t-test and stepwise regression. The t-test relies on accurate estimates of parameter variances to determine significance, while stepwise regression relies on the incremental change in the residual sum of squares (RSS) that results from adding a parameter. Both methods need accurate estimates of the model residuals to determine structure. However, unbiased estimates of the residuals are difficult to obtain unless the structure is correct. Since the number of candidate terms can become very large for even moderately complex models, the estimated residuals are highly biased (underdispersed), making structure detection difficult. Hence, both methods have difficulty with highly over-parameterized models.
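The dependence of the t-test on the residuals can be made concrete with a short sketch of the standard ordinary least-squares t-statistic. The function name `t_statistics` and the synthetic data are assumptions for illustration; the point is only that the residual variance estimate enters every statistic, so biased residuals distort every significance decision.

```python
import numpy as np


def t_statistics(X, y):
    """Classical OLS t-statistics for each column of X (illustrative helper)."""
    n, p = X.shape
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ theta
    # Residual variance estimate; this is the quantity that becomes
    # unreliable when the model is heavily over-parameterized.
    sigma2 = resid @ resid / (n - p)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return theta / np.sqrt(np.diag(cov))


# Synthetic example: only the first regressor actually drives the output
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=100)
t = t_statistics(X, y)
print(t)
```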
Recently, bootstrap methods have received considerable attention due to the availability of affordable and powerful computers. The bootstrap is a numerical procedure for estimating statistical parameters that requires few assumptions. The conditions needed for bootstrap methods are quite mild; namely, that the errors be identically distributed and have zero mean. Consequently, we hypothesize that the bootstrap method might be useful for model order selection and structure detection of nonlinear models.
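As a rough illustration of the idea, the sketch below applies a generic residual bootstrap to a linear-in-the-parameters regression and flags terms whose percentile interval excludes zero. This is not the BMOS or BSD algorithm presented here; the static regressors, the 95% percentile interval, the number of replications `B`, and all variable names are assumptions chosen to keep the example short. A dynamic NARMAX model would additionally require regenerating lagged outputs in each replication.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: three candidate regressors, the middle one is spurious
N = 200
X = rng.normal(size=(N, 3))
theta_true = np.array([1.5, 0.0, -0.8])
y = X @ theta_true + 0.1 * rng.normal(size=N)

# Initial least-squares fit and centered residuals (zero-mean assumption)
theta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ theta_hat
resid = resid - resid.mean()

# Residual bootstrap: resample residuals, rebuild the output, refit
B = 500
boot = np.empty((B, X.shape[1]))
for b in range(B):
    e_star = rng.choice(resid, size=N, replace=True)
    y_star = X @ theta_hat + e_star
    boot[b] = np.linalg.lstsq(X, y_star, rcond=None)[0]

# Percentile intervals; a term is kept if its interval excludes zero
lower, upper = np.percentile(boot, [2.5, 97.5], axis=0)
keep = (lower > 0) | (upper < 0)
print(keep)
```

The intervals come directly from the empirical distribution of the refitted parameters, so no Gaussian assumption on the errors is needed.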
In this presentation, I demonstrate that the bootstrap is a useful statistical tool for identification of parametric nonlinear systems. The results show that my bootstrap model order selection (BMOS) and bootstrap structure detection (BSD) algorithms are robust methods for selecting the order and structure of NARMAX models with a high probability of success and are resistant to noise. Combined, these techniques provide accurate estimates of parameter statistics without relying on assumptions made by traditional procedures and yield a parsimonious description of the system.