[ Main
| Projects
| People
| Meetings
| Library
]
Lazy Learning for modeling and control
The goal of this project is to develop a methodology to model and
control complex systems starting from a given set of examples of their
behavior.
Learning and control: the current state-of-the-art
The problem of modeling from observed data has been studied in
several disciplines, from linear statistics and nonlinear regression
to system identification and machine learning.
In the literature dealing with this problem, many different approaches
have emerged.
However, all modeling methods involve a structural phase in which
the structure of the model is defined, and a parametric phase
in which the model is fitted to the data by means of an optimization
procedure.
A first possible classification of the approaches to modeling is based
on structural considerations: global modeling versus
local modeling techniques.
Global modeling consists in describing the behavior of the system at
hand by means of a single model that covers the whole space of possible
operating regimes - for example a global linear model or a
neural network.
On the other hand, local modeling provides a description of the
system by combining several models pertaining to different
operating regimes.
Each of the models is obtained giving full attention to a reduced
portion of the space of the possible behaviors, yielding a
more accurate description even when simple approximators (for example linear
models) are used.
It is precisely the simple form of the local models, and consequently
the possibility of handling them using standard and well-known tools from
linear statistics, that makes the local approach appealing.
A second reason for the growing popularity of local methods is that
the decomposition of the learning task into sub-tasks makes the whole
process easier to manage, allowing for instance the integration of
physical models into the black-box description.
Radial Basis Function Networks (Moody and Darken, 1989), Fuzzy
Models (Takagi and Sugeno, 1985) and, as far as control is concerned, Gain
Scheduling (Rugh, 1991), are all local approaches.
The modeling process, adopting such methods, starts with a training
phase during which the examples available are used both to extract the
local descriptions of the system and to define a partition of the
space of the operating regimes.
Any request for information is fulfilled by interpolating the answers of
different local models.
A second possible classification is based on the lazy/eager
distinction (Aha, 1997). All the methods described above are
eager methods: the examples are compiled into an intensional
concept description, the initial database is then discarded, and the
description obtained is used to reply to information requests.
On the other hand, in lazy modeling methods the examples,
which are never discarded, are the only representation of the system
being modeled, and no other global description is defined.
Lazy learning (Atkeson, Moore, and Schaal, 1997), also known as
just-in-time learning (Cybenko, 1996), is inspired
by nearest-neighbor techniques
and by nonparametric statistics. It defers processing of the
examples until an explicit request for information is received. When
this happens, the database available is searched for those examples
that, according to some measure of distance, are considered most
relevant to answer the query. These examples are used to extract a
local description of the system - for example through a local linear
model - and finally to fulfill the request. Both the answer and any
intermediate results are then discarded and each following request for
information will make the full process start again.
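The query procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's implementation: inputs are scalar, relevance is measured by absolute distance, the local model is a straight line fitted to the k nearest examples, and the names lazy_predict and k are hypothetical.

```python
import math

def lazy_predict(database, query, k=5):
    """Answer one query from stored (x, y) examples with a local linear fit."""
    # Step 1: search the database for the k examples most relevant to the query.
    neighbors = sorted(database, key=lambda ex: abs(ex[0] - query))[:k]
    # Step 2: extract a local description (least squares line through the neighbors).
    n = len(neighbors)
    mean_x = sum(x for x, _ in neighbors) / n
    mean_y = sum(y for _, y in neighbors) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in neighbors)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in neighbors)
    # Fall back to a local constant model if all neighbors share the same x.
    slope = sxy / sxx if sxx > 0 else 0.0
    # Step 3: answer the request; the local model is not kept.
    return mean_y + slope * (query - mean_x)

# "Training" amounts to storing the examples (here, samples of sin on [0, 6.2]).
database = [(i / 10, math.sin(i / 10)) for i in range(63)]
prediction = lazy_predict(database, 1.234, k=5)
print(prediction, math.sin(1.234))
```

Every call repeats the search and the fit from scratch, which is exactly the cost that the lazy approach accepts in exchange for a trivial training phase.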
As far as the distinction between local and global modeling is
concerned, lazy learning belongs to the class of local methods
but pushes the idea of locality to the extreme.
Most of the appealing features peculiar to local modeling are
maintained: the adoption of linear local models guarantees both
readability and applicability of reliable statistical tools. Some
features are even enhanced: the lazy approach is able to deal
effectively with situations in which the examples are not evenly
distributed or in which the noise affecting the data follows
different distributions in different operating regimes. Moreover,
since the training phase is computationally inexpensive and simply
amounts to a storage of the available examples into a database, the
lazy approach is particularly suitable when the examples are
not all available from the beginning but
are collected on-line.
In this case, each newly observed example requires only an update to the
database. It is worth noticing that, contrary to global approximators,
lazy learning does not suffer from data interference
(Atkeson, Moore, and Schaal, 1997). That is, acquiring examples about one
operating regime does not degrade modeling performance in the others.
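The on-line setting and the absence of data interference can be illustrated with a small sketch. The class LazyModel, its methods, and the two synthetic regimes are assumptions made for this example; they are not taken from the Lazy Learning Toolbox.

```python
class LazyModel:
    """Hypothetical minimal lazy learner for scalar inputs and outputs."""

    def __init__(self, k=4):
        self.k = k
        self.examples = []  # the stored examples are the only model

    def add(self, x, y):
        # Learning from a new observation is a database update, nothing more.
        self.examples.append((x, y))

    def predict(self, query):
        # Fit a line to the k nearest examples; the fit is discarded afterwards.
        near = sorted(self.examples, key=lambda ex: abs(ex[0] - query))[:self.k]
        mean_x = sum(x for x, _ in near) / len(near)
        mean_y = sum(y for _, y in near) / len(near)
        sxx = sum((x - mean_x) ** 2 for x, _ in near)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in near)
        slope = sxy / sxx if sxx > 0 else 0.0
        return mean_y + slope * (query - mean_x)


model = LazyModel(k=4)
for i in range(11):              # first regime: x in [0, 1], y = 2x + 1
    model.add(i / 10, 2 * (i / 10) + 1)
before = model.predict(0.5)

for i in range(11):              # second regime, observed later: x in [10, 11]
    x = 10 + i / 10
    model.add(x, -3 * x)
after = model.predict(0.5)

# The examples of the second regime never enter the local fit around 0.5.
print(before == after)
```

A single global model fitted to both regimes would have to trade accuracy in one against accuracy in the other; here the prediction around 0.5 depends only on the examples stored near 0.5.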
The drawbacks of lazy learning are mainly associated with the
necessity of a possibly large amount of memory to store the data, and
with the fact that each request for information involves starting the
identification of a local model from scratch. Nevertheless, the
evolution of computer hardware has already partially eased these
problems.
Even if the scientific community is becoming more and more aware of
the great potential of the lazy learning approach, some
problems have not yet found a widely accepted answer.
Among them, the most challenging are related to the structural aspects
of the algorithm: the definition of a distance metric to evaluate the
relevance of the examples neighboring the query point, the selection of
the structure of the local approximator (e.g. selection among
polynomials of various degrees), the choice of the number of examples to
be used for each identification of a local model and of their relative
weights, and finally the selection of the features to be considered.
Description of the research project
Although nonlinearity characterizes most real systems, methods for
modeling and control design are considerably more powerful and
theoretically better founded for linear systems than for nonlinear
ones.
The guiding principle of this project is therefore to cope with
nonlinear systems by
adapting well-known linear methods, instead of developing new ones
from scratch.
The aim of the project is to gain more insight into the existing
modeling techniques collectively known as lazy learning,
especially regarding the various ways proposed to tackle the problems
introduced in the previous section. The final goal is to define a
consistent lazy learning methodology to model and control
complex systems from possibly noisy and uncertain observed data.
The Lazy Learning group at IRIDIA proposes a method that relies heavily
on the use of cross-validation techniques (Efron and Tibshirani, 1993)
to identify the structural parameters of the lazy learning
algorithm.
We have already worked on a method to define, for each given query
point, the widest surrounding region of local
linearity (Birattari, Bontempi, and Bersini, 1999).
The method, based on a recursive least squares algorithm, allows a fast
identification and validation of different local models fitted on an
increasing number of neighboring examples.
The process is stopped when a significant departure from the region of
local linearity is detected.
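The incremental identification can be sketched as follows. This is a simplified illustration, not the algorithm of the cited paper: the model is a line y = theta[0] + theta[1] * x, neighbors are added in order of distance from the query, and the stopping test (a fixed threshold on the prediction error for the newly added neighbor) is a crude stand-in assumed here for the validation-based test. The names rls_update, grow_neighborhood, min_k and tolerance are hypothetical.

```python
def rls_update(theta, P, x, y):
    """One recursive least squares step for the model y = theta[0] + theta[1] * x."""
    phi = [1.0, x]
    Pphi = [P[0][0] * phi[0] + P[0][1] * phi[1],
            P[1][0] * phi[0] + P[1][1] * phi[1]]
    denom = 1.0 + phi[0] * Pphi[0] + phi[1] * Pphi[1]
    error = y - (theta[0] + theta[1] * x)  # prediction error before the update
    gain = [Pphi[0] / denom, Pphi[1] / denom]
    theta = [theta[0] + gain[0] * error, theta[1] + gain[1] * error]
    # P is symmetric, so gain[i] * Pphi[j] is the (i, j) entry of gain * phi' * P.
    P = [[P[i][j] - gain[i] * Pphi[j] for j in range(2)] for i in range(2)]
    return theta, P, error


def grow_neighborhood(examples, query, min_k=3, tolerance=0.05):
    """Add neighbors one at a time until the next one no longer fits the line."""
    ordered = sorted(examples, key=lambda ex: abs(ex[0] - query))
    theta, P = [0.0, 0.0], [[1e6, 0.0], [0.0, 1e6]]  # large P: weak prior on theta
    used = 0
    for k, (x, y) in enumerate(ordered, start=1):
        new_theta, new_P, error = rls_update(theta, P, x, y)
        if k > min_k and abs(error) > tolerance:
            break  # departure from local linearity: keep the previous model
        theta, P, used = new_theta, new_P, k
    return theta, used


# Synthetic data: y = x up to x = 1, then a much steeper slope.
examples = [(i / 10, i / 10 if i <= 10 else 1 + 4 * (i / 10 - 1)) for i in range(21)]
theta, used = grow_neighborhood(examples, 0.3)
print(theta, used)
```

Each new neighbor costs one constant-time update instead of a full refit, which is what makes it cheap to examine many neighborhood sizes for the same query.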
The key element of this method is a recursive implementation of the
PRESS statistic (Myers, 1990), which is a fast leave-one-out
validation procedure for linear models.
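For a linear model fitted by least squares, the leave-one-out residual of example i equals its ordinary residual divided by 1 - h_ii, where h_ii is the leverage of that example, so no refitting is needed. The sketch below checks this identity for a straight-line fit, for which h_ii = 1/n + (x_i - mean(x))^2 / Sxx. It shows the batch form of the statistic, not the recursive implementation mentioned above, and the function names and data are made up for the example.

```python
def fit_line(xs, ys):
    """Least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope


def press_residuals(xs, ys):
    """Leave-one-out residuals from a single fit, via the leverage shortcut."""
    n = len(xs)
    intercept, slope = fit_line(xs, ys)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    residuals = []
    for x, y in zip(xs, ys):
        leverage = 1 / n + (x - mean_x) ** 2 / sxx
        residuals.append((y - (intercept + slope * x)) / (1 - leverage))
    return residuals


def brute_force_loo(xs, ys):
    """Leave-one-out residuals obtained by refitting n times (for comparison)."""
    residuals = []
    for i in range(len(xs)):
        intercept, slope = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        residuals.append(ys[i] - (intercept + slope * xs[i]))
    return residuals


xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.1, 0.9, 2.3, 2.8, 4.2, 4.9]
fast = press_residuals(xs, ys)
slow = brute_force_loo(xs, ys)
press = sum(e ** 2 for e in fast)  # the PRESS statistic
print(press)
```

The two lists of residuals agree up to rounding, while the shortcut needs one fit instead of n.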
The same validation technique may also be used to select, for each
query point, the parametric structure of the local approximator and to
perform a selection of the relevant features.
We suggest that this validation procedure may also be used for a local
definition of the distance metric and for the detection of
outliers. This extension will eventually allow the tuning of all the
parameters of the learning algorithm on a query-by-query basis.
As far as control is concerned, we propose an approach inspired by the
self-tuning regulator (STR) architecture (Astrom and Wittenmark, 1990).
At each time-step, the lazy learning algorithm yields
the local linearization of the system about the current operating
regime.
The description obtained is used to design a local controller by
exploiting classical discrete-time linear control techniques -
e.g. generalized minimum variance
and pole placement.
The main difference from conventional adaptive control
techniques lies in the parameter estimation scheme. In the STR,
identification is performed by a recursive parameter estimator which
updates the same linear model when a new example is
observed.
In the proposed approach there is no global description; instead, at
each time-step the system dynamics is linearized in the neighborhood of
the current operating regime (Bontempi, Birattari, and Bersini, 1999).
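The control loop can be sketched on a toy problem. Everything specific here is an assumption made for illustration: the first-order plant, the grid of stored transitions, the choice of selecting neighbors by the current output only, the value k = 15, and the one-step-ahead (deadbeat) law, which is used as a simple stand-in for the generalized minimum variance and pole placement designs named above.

```python
def solve(A, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def plant(y, u):
    """Hypothetical nonlinear plant, used only to generate data and to simulate."""
    return 0.9 * y - 0.1 * y ** 3 + u


def local_linearization(database, y_now, k=15):
    """Fit y_next = a*y + b*u + c on the k stored transitions closest to y_now."""
    neighbors = sorted(database, key=lambda ex: abs(ex[0] - y_now))[:k]
    A = [[0.0] * 3 for _ in range(3)]
    rhs = [0.0] * 3
    for y, u, y_next in neighbors:  # accumulate the normal equations
        phi = [y, u, 1.0]
        for i in range(3):
            rhs[i] += phi[i] * y_next
            for j in range(3):
                A[i][j] += phi[i] * phi[j]
    return solve(A, rhs)


def control_step(database, y_now, reference):
    a, b, c = local_linearization(database, y_now)
    # One-step-ahead law: choose u so that the local model predicts the reference.
    return (reference - a * y_now - c) / b


# Stored transitions (y, u, y_next) on a regular grid of operating conditions.
database = [(y, u, plant(y, u))
            for y in [i * 0.25 - 2.0 for i in range(17)]
            for u in [j * 0.5 - 1.0 for j in range(5)]]

y, reference = 0.0, 1.0
trajectory = [y]
for _ in range(10):
    u = control_step(database, y, reference)  # a new local model at every step
    y = plant(y, u)
    trajectory.append(y)
print(trajectory[-1])
```

No model survives from one time-step to the next: the linearization and the controller are both recomputed around the current output, in contrast with the single recursively updated model of the STR.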
We expect that the lazy learning based control architecture may
be a valid alternative to neuro-control, owing to the local use of
well-understood linear techniques, which will allow an analysis of its
stability properties.
Another fundamental issue on which the project focuses is the
analysis of the links between lazy learning and the
neuro-fuzzy approach.
The incremental algorithm we propose for identifying local models could
also be used to identify the local models of a fuzzy system.
On the other hand, methods of stability analysis developed for fuzzy
control could inspire equivalent methods for control techniques based
on lazy learning.
The Lazy Learning Toolbox
In the framework of the Lazy Learning for Modeling and Control
project, we have implemented a toolbox for Matlab. For more information,
please see the Lazy Learning Toolbox Home Page.
[ Gianluca Bontempi
| Mauro Birattari
| Hugues Bersini ]
References
- D.W. Aha.
Editorial.
Artificial Intelligence Review, 11(1-5):1-6, 1997.
(Special Issue on Lazy Learning).
- K.J. Astrom and B. Wittenmark.
Computer-controlled Systems: Theory and Design.
Prentice-Hall International Editions, 1990.
- C.G. Atkeson, A.W. Moore, and S. Schaal.
Locally weighted learning.
Artificial Intelligence Review, 11(1-5):11-73, 1997.
- M. Birattari, G. Bontempi, and H. Bersini.
Lazy learning meets the recursive least squares algorithm.
Advances in Neural Information Processing Systems 11,
pp 375-381. MIT Press, Cambridge, MA, 1999.
- G. Bontempi, M. Birattari, and H. Bersini.
Lazy learning for local modeling and control design.
International Journal of Control,
72(7/8):643-658, 1999.
- G. Cybenko.
Just-in-time learning and estimation.
In S. Bittanti and G. Picci, editors, Identification,
Adaptation, Learning: The Science of Learning Models from
Data, NATO ASI
Series, pp 423-434. Springer, 1996.
- B. Efron and R.J. Tibshirani.
An Introduction to the Bootstrap.
Chapman and Hall, New York, NY, 1993.
- J. Moody and C.J. Darken.
Fast learning in networks of locally-tuned processing units.
Neural Computation, 1(2):281-294, 1989.
- R.H. Myers.
Classical and Modern Regression with Applications.
PWS-KENT, Boston, MA, 1990.
- W.J. Rugh.
Analytical framework for gain scheduling.
IEEE Control Systems, 11(1):79-84, 1991.
- T. Takagi and M. Sugeno.
Fuzzy identification of systems and its applications to modeling and
control.
IEEE Transactions on Systems, Man, and Cybernetics,
15(1):116-132, 1985.
Last updated Oct 19, 1999