IRIDIA [ Main | Projects | People | Meetings | Library ] IRIDIA

Lazy Learning for modeling and control

The goal of this project is to develop a methodology to model and control complex systems starting from a given set of examples of their behavior.

Learning and control: the current state-of-the-art

The problem of modeling from observed data has been studied in several disciplines, from linear statistics and nonlinear regression to system identification and machine learning. In the literature dealing with this problem, many different approaches have emerged. However, all modeling methods imply a structural phase in which the structure of the model is defined, and a parametric phase in which the model is fitted to the data by means of an optimization procedure. A first possible classification of the approaches to modeling is based on structural considerations: global modeling versus local modeling techniques.
Global modeling consists in describing the behavior of the system at hand by means of a single model that covers the whole space of possible operating regimes - for example a global linear model or a neural network. On the other hand, local modeling provides a description of the system by combining several models pertaining to different operating regimes. Each of the models is obtained giving full attention to a reduced portion of the space of the possible behaviors, yielding a more accurate description even when simple approximators (for example linear models) are used. It is precisely the simple form of the local models, and consequently the possibility of handling them using standard and well-known tools from linear statistics, that makes the local approach appealing. A second reason for the growing popularity of local methods is that the decomposition of the learning task into sub-tasks makes the whole process easier to manage, allowing for instance the integration of physical models into the black-box description. Radial Basis Function Networks (Moody and Darken, 1989), Fuzzy Models (Takagi and Sugeno, 1985) and, as far as control is concerned, Gain Scheduling (Rugh, 1991) are all local approaches. With such methods, the modeling process starts with a training phase during which the available examples are used both to extract the local descriptions of the system and to define a partition of the space of the operating regimes. Any request for information is fulfilled by interpolating the answers of different local models.
A second possible classification is based on the lazy/eager distinction (Aha, 1997). All the methods described above are eager methods: the examples are compiled into an intensional concept description, the initial database is then discarded, and the description obtained is used to reply to information requests.
On the other hand, in lazy modeling methods the examples, which are never discarded, are the only representation of the system being modeled, and no other global description is defined. Lazy learning (Atkeson, Moore, and Schaal, 1997), also known as just-in-time learning (Cybenko, 1996), is inspired by nearest-neighbor techniques and by nonparametric statistics. It defers processing of the examples until an explicit request for information is received. When this happens, the database available is searched for those examples that, according to some measure of distance, are considered most relevant to answer the query. These examples are used to extract a local description of the system - for example through a local linear model - and finally to fulfill the request. Both the answer and any intermediate results are then discarded and each following request for information will make the full process start again.
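The query-time procedure described above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: it assumes scalar inputs, the absolute difference as distance measure, a fixed number of neighbors k, and unweighted least squares. The function name lazy_predict and the sine data are hypothetical.

```python
import math

def lazy_predict(examples, query, k=5):
    """Answer one query from stored (x, y) examples with a local linear fit."""
    # 1. Search the database for the k examples closest to the query.
    neighbors = sorted(examples, key=lambda ex: abs(ex[0] - query))[:k]
    n = len(neighbors)
    xs = [x for x, _ in neighbors]
    ys = [y for _, y in neighbors]
    # 2. Fit y = intercept + slope * x to those neighbors by least squares.
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = y_mean - slope * x_mean
    # 3. Answer the query; the local model is not kept afterwards.
    return intercept + slope * query

# The "training phase" is just storage of the examples (hypothetical data).
database = [(i / 10, math.sin(i / 10)) for i in range(63)]
database.append((6.3, math.sin(6.3)))  # a new observation only needs an append

prediction = lazy_predict(database, 1.05, k=4)
```

Nothing is computed until lazy_predict is called, and each call starts again from the raw database.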
As far as the distinction between local and global modeling is concerned, lazy learning belongs to the class of local methods but pushes the idea of locality to the extreme. Most of the appealing features peculiar to local modeling are maintained: the adoption of linear local models guarantees both readability and applicability of reliable statistical tools. Some features are even enhanced: the lazy approach is able to deal effectively with situations in which the examples are not evenly distributed or in which the noise affecting the data is described by different distributions for different operating regimes. Moreover, since the training phase is computationally inexpensive and simply amounts to storing the available examples in a database, the lazy approach is particularly suitable when the examples are not all available from the beginning but are collected on-line. In this case a newly observed example requires only an update to the database. It is worth noting that, contrary to global approximators, lazy learning does not suffer from data interference (Atkeson, Moore, and Schaal, 1997). That is, acquiring examples about one operating regime does not degrade modeling performance in the others. The drawbacks of lazy learning are mainly associated with the possibly large amount of memory needed to store the data, and with the fact that each request for information involves starting the identification of a local model from scratch. Nevertheless, the evolution of computer hardware has already partially eased these problems.
Even if the scientific community is becoming more and more aware of the great potential of the lazy learning approach, some problems have not yet found a widely accepted answer. Among them, the most challenging are related to the structural aspects of the algorithm: the definition of a distance metric to evaluate the relevance of the examples neighboring the query point, the selection of the structure of the local approximator (e.g. selection among polynomials of various degrees), the choice of the number of examples to be used for each identification of a local model and their relative weight, and finally the selection of the features to be considered.

Description of the research project

Although nonlinearity characterizes most real systems, methods for modeling and control design are considerably more powerful and theoretically better founded for linear systems than for nonlinear ones. The guiding principle of this project is therefore to cope with nonlinear systems by adapting well-known linear methods, instead of developing new ones from scratch.
The aim of the project is to gain more insight into the existing modeling techniques collectively known as lazy learning, especially regarding the various ways proposed to tackle the problems introduced in the previous section. The final goal is to define a consistent lazy learning methodology to model and control complex systems from possibly noisy and uncertain observed data.
The Lazy Learning group at IRIDIA proposes a method that relies heavily on the use of cross-validation techniques (Efron and Tibshirani, 1993) to identify the structural parameters of the lazy learning algorithm.
We have already worked on a method to define, for each given query point, the widest surrounding region of local linearity (Birattari, Bontempi, and Bersini, 1999). The method, based on a recursive least squares algorithm, allows a fast identification and validation of different local models fitted on an increasing number of neighboring examples. The process is stopped when a significant departure from the region of local linearity is detected. The key element of this method is a recursive implementation of the PRESS statistic (Myers, 1990), which is a fast leave-one-out validation procedure for linear models. The same validation technique may also be used to select, for each query point, the parametric structure of the local approximator and to perform a selection of the relevant features.
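To illustrate why PRESS makes leave-one-out validation cheap, the sketch below computes the leave-one-out residuals of a straight-line fit from a single fit, using the standard identity that each ordinary residual divided by one minus its leverage equals the residual obtained when that example is left out. It is only an illustration under simplifying assumptions: scalar inputs with distinct values, a batch fit recomputed for every candidate neighborhood size rather than the recursive least squares update described above, and hypothetical names (press_residuals, select_neighbors, k_min, k_max).

```python
def press_residuals(xs, ys):
    """Leave-one-out residuals of the fit y = a + b*x, without refitting.

    Assumes at least three examples with distinct x values.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
    intercept = y_mean - slope * x_mean
    residuals = []
    for x, y in zip(xs, ys):
        # Leverage: diagonal element of the hat matrix for this example.
        leverage = 1.0 / n + (x - x_mean) ** 2 / sxx
        residuals.append((y - intercept - slope * x) / (1.0 - leverage))
    return residuals

def select_neighbors(examples, query, k_min=3, k_max=20):
    """Return (k, loo_mse) for the neighborhood size with the smallest
    leave-one-out mean squared error around the query."""
    ranked = sorted(examples, key=lambda ex: abs(ex[0] - query))
    best = None
    for k in range(k_min, min(k_max, len(ranked)) + 1):
        xs = [x for x, _ in ranked[:k]]
        ys = [y for _, y in ranked[:k]]
        errors = press_residuals(xs, ys)
        mse = sum(e * e for e in errors) / k
        if best is None or mse < best[1]:
            best = (k, mse)
    return best

# Hypothetical data: a noiseless quadratic sampled on a regular grid.
examples = [(i / 10, (i / 10) ** 2) for i in range(50)]
k, loo_mse = select_neighbors(examples, query=2.0)
```

Here the neighborhood size is chosen by exhaustive comparison of the leave-one-out error. The stopping rule based on a significant departure from local linearity is not reproduced.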
We suggest that this validation procedure may be used also for a local definition of the distance metric and for the detection of outliers. This extension will eventually allow the tuning of all the parameters of the learning algorithm on a query-by-query basis.
As far as control is concerned, we propose an approach inspired by the self-tuning regulator (STR) architecture (Astrom and Wittenmark, 1990). At each time-step, the lazy learning algorithm yields the local linearization of the system about the current operating regime. The description obtained is used to design a local controller by exploiting classical discrete-time linear control techniques - e.g. generalized minimum variance and pole placement. The main difference from conventional adaptive control techniques lies in the parameter estimation scheme. In the STR, identification is performed by a recursive parameter estimator which updates the same linear model whenever a new example is observed. In the proposed approach there is no global description; instead, at each time-step the system dynamics is linearized in the neighborhood of the current operating regime (Bontempi, Birattari, and Bersini, 1999). We expect that the lazy learning based control architecture may be a valid alternative to neuro-control, because the local use of well understood linear techniques will allow an analysis of its stability properties.
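The control loop just described can be sketched as follows. Everything specific here is an assumption made for illustration: the first-order plant, the grid of stored transitions, the use of the current output and the previous input as query point, and the one-step-ahead design (choose the input for which the local model predicts the reference), which stands in for the generalized minimum variance and pole placement designs mentioned above.

```python
import math

def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (aug[r][n] - tail) / aug[r][r]
    return sol

def local_linear_model(database, y, u, k=10):
    """Fit y_next = c + a*y + b*u on the k stored examples nearest to (y, u)."""
    nearest = sorted(
        database, key=lambda ex: (ex[0] - y) ** 2 + (ex[1] - u) ** 2
    )[:k]
    rows = [(1.0, ey, eu) for ey, eu, _ in nearest]
    targets = [ey_next for _, _, ey_next in nearest]
    # Normal equations of the least squares problem.
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    moment = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
    return solve(gram, moment)  # [c, a, b]

def plant(y, u):
    """Hypothetical nonlinear system, used only to generate data and simulate."""
    return 0.8 * math.sin(y) + u + 0.1 * u ** 3

# Database of observed transitions (y, u, y_next) on a grid of operating points.
grid = [i / 10 - 2.0 for i in range(41)]
database = [(y, u, plant(y, u)) for y in grid for u in grid]

reference, y, u_prev = 1.0, 0.0, 0.0
for _ in range(15):
    # Linearize about the current operating regime (assumed query: y, u_prev).
    c, a, b = local_linear_model(database, y, u_prev)
    # One-step-ahead design: pick u so the local model predicts the reference.
    u = (reference - c - a * y) / b if abs(b) > 1e-6 else u_prev
    u = max(-2.0, min(2.0, u))  # stay inside the region covered by the data
    y, u_prev = plant(y, u), u
```

Unlike a recursive estimator that keeps updating one linear model, the loop identifies a fresh local model at every time-step and discards it after computing the control action.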
Another fundamental issue on which the project focuses is the analysis of the links between lazy learning and the neuro-fuzzy approach. The incremental algorithm we propose for identifying local models may possibly be considered for identifying the local models of a fuzzy system. On the other hand, methods of stability analysis developed for fuzzy control could inspire equivalent methods for control techniques based on lazy learning.

The Lazy Learning Toolbox

In the framework of the Lazy Learning for Modeling and Control project, we have implemented a toolbox for Matlab. For more information, please see the Lazy Learning Toolbox Home Page.


[ Gianluca Bontempi | Mauro Birattari | Hugues Bersini ]

References



Last updated Oct 19, 1999 | Comments to Webmaster