Neural Networks

My Experience with Neural Networks

The following sections summarize what I have been doing in neural networks.

Introduction to Neural Networks

The field of cybernetics recognizes that information processing originates with living creatures in their struggle for survival. From this viewpoint, we can begin to consider information processing techniques that are inherently different from those used in conventional computations. Neurocomputers, based on principles found in living systems, are highly parallel structures designed to directly process information emanating from the external world, without the intermediate step of symbolic representation. Central to neurocomputers are artificial neural networks (ANNs). One of the general goals of artificial neural network researchers is to circumvent the inherent limits of serial digital computation.

An ANN is a network of artificial neurons. These artificial neurons are specialized computational elements performing simple computational functions. The manner in which these neurons are interconnected defines the topology or architecture of the network. Whereas a classical digital computer is programmed, an ANN is trained. Adjusting the strengths of interconnections (weights) among the neurons constitutes training or learning. The concept of memory in a conventional computer corresponds to the concept of weight settings. The processing and storage functions in ANNs are not centralized and distinct; each neuron acts as a processor, and the set of weights associated with that neuron acts as distributed storage. In a typical ANN one can expect to find hundreds of processors and thousands of storage elements.
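The weighted-sum picture of a single artificial neuron can be sketched in a few lines of Python; the sigmoid activation and the particular inputs and weights below are illustrative assumptions, not a description of any specific network mentioned here.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of its inputs passed
    through a sigmoid activation (an illustrative choice)."""
    net = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-net))

# With all weights and the bias at zero, the output is sigmoid(0) = 0.5
# for any input pattern; training consists of adjusting the weights.
print(neuron([1.0, 0.0, 1.0], [0.0, 0.0, 0.0], 0.0))  # -> 0.5
```

Training amounts to nudging `weights` and `bias` so that the outputs move toward desired targets.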

In the parlance of physical sciences, a neural network is a nonlinear dynamical system which is capable of mimicking some aspect of cognition. If each neuron is visualized as an analog operational amplifier, then a fully parallel analog computer would serve as an excellent first approximation to one class of neural networks.

Methods based on neural networks have a distinctly different flavor from those based on artificial intelligence (AI) techniques. In classical AI, a symbolic representation of the external world is the starting point and a digital computer is used as a symbol-manipulating engine. The symbol string obtained as a solution is converted back into a physical representation for human cognition. Expert systems, an offspring of the AI school, attempt to capture the domain knowledge of a problem in terms of IF..THEN..ELSE rules. Formulating these rules is a tedious process, and systems built on this philosophy tend to be "brittle"; that is, any new knowledge may force a radical redefinition of the rule base. Artificial neural networks, by virtue of their training, exhibit a more "plastic" behavior. For this reason, ANNs more appropriately belong to a class of methods that are being dubbed "soft computing."

The term "soft computing" can be defined as a collection of methods based on principles derived from neural networks, genetic algorithms, fuzzy set theory, artificial life, and so on. The goal of this emerging computational discipline is to solve computationally hard problems not by brute force but by borrowing principles of information processing from nature. Beginning in the early 1980s, ANNs, as well as some of the other soft computing methods, have been used systematically to solve a variety of computationally hard problems such as pattern recognition under real-world conditions, fuzzy pattern matching, nonlinear discrimination of noisy signals, combinatorial optimization, and nonlinear real-time control.

 

Our Experience with Neural Nets

The burgeoning literature in the field of ANN research is full of examples of the validity, advantages and shortcomings of the new paradigm. Our own experience covers only a small subset of the problems that can be solved with neural nets.

(a) Analysis of Seismic Signals.

One of the important problems of post-cold-war politics is verifying compliance with nuclear test ban treaties. Back propagation on feed-forward networks, unsupervised self-organizing networks, radial basis function networks, probabilistic networks, as well as adaptive resonance techniques have been used by members of our team to discriminate underground nuclear explosions from earthquakes by analyzing far-field seismic signals. We were also successful in deducing seismic parameters such as the depth of an event, as well as dip and slip. Although all ANN methods gave results comparable to or better than those of conventional techniques, "conjugate gradient back propagation with weight elimination" made classification predictions with consistently better than 90% accuracy.
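The weight-update rule at the heart of back propagation can be illustrated on a single sigmoid neuron trained by gradient descent on a squared-error function; the toy two-feature data below is invented for the sketch and has nothing to do with seismic signals.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Invented training set: label is 1 when the first feature exceeds the second.
data = [((0.9, 0.1), 1.0), ((0.8, 0.3), 1.0),
        ((0.2, 0.7), 0.0), ((0.1, 0.9), 0.0)]

w, b, lr = [0.0, 0.0], 0.0, 1.0
for _ in range(200):
    for (x0, x1), target in data:
        y = sigmoid(w[0] * x0 + w[1] * x1 + b)
        delta = (y - target) * y * (1.0 - y)   # gradient of the squared error
        w[0] -= lr * delta * x0                # descend along the gradient
        w[1] -= lr * delta * x1
        b -= lr * delta
```

In a full network the same error signal is propagated backward layer by layer; here there is only one layer, so the sketch reduces to plain gradient descent.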

 

(b) Ill-posed Problems.

Many inversion problems of remote sensing can be formulated as Fredholm integral equations of the first kind. Inverse problems are difficult to solve not only because they are ill-posed in the Hadamard sense, but also because the associated matrices are ill-conditioned. By using the sum of the squared errors as the energy function of a Hopfield net, we were able to invert a variety of poorly conditioned matrices arising out of Fredholm equations. We are in the process of refining and applying this technique to deduce ozone profiles from satellite data and to deduce atmospheric aerosol concentrations from data gathered from Guidestar experiments being conducted at the Lawrence Livermore National Laboratory.
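The idea of using the sum of squared errors as an energy function can be sketched as gradient descent on E(x) = ||Ax - b||^2, the continuous analogue of letting a Hopfield-style network relax to a minimum-energy state; the 2x2 matrix below is an invented, well-conditioned stand-in for the ill-conditioned Fredholm matrices discussed above.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def solve_by_energy_descent(A, b, steps=5000, lr=0.01):
    """Let the network state x slide downhill on the energy
    E(x) = ||Ax - b||^2, whose gradient is 2 A^T (Ax - b)."""
    n = len(A[0])
    x = [0.0] * n
    for _ in range(steps):
        r = [ri - bi for ri, bi in zip(matvec(A, x), b)]   # residual Ax - b
        grad = [2.0 * sum(A[i][j] * r[i] for i in range(len(A)))
                for j in range(n)]
        x = [xj - lr * gj for xj, gj in zip(x, grad)]
    return x

# Invented example; the exact solution is x = [1, 1].
A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 4.0]
x = solve_by_energy_descent(A, b)
```

For a genuinely ill-conditioned matrix the descent direction would need damping or regularization; the sketch shows only the energy-function formulation itself.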

 

(c) Time-series Prediction.

Chaotic phenomena abound in nature. Because there is often a well-defined deterministic generating process behind the observed chaos, it should be possible to make short-range predictions of such phenomena; it is well known that long-range predictions are not possible. We were able to train back propagation networks to make short-range predictions of well-known chaotic time series such as the Mackey-Glass series. This work has important implications in such disparate areas as digital communications and stock portfolio management. We are in the process of applying this method to identify the internal structure of a digital shift register by studying the pseudo-random bit sequence generated by the system.
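The Mackey-Glass series itself can be generated by coarse Euler integration of its delay-differential equation, dx/dt = a x(t - tau) / (1 + x(t - tau)^10) - b x(t), with the commonly used parameters a = 0.2, b = 0.1, tau = 17; the unit step size and constant initial history are simplifying assumptions for this sketch.

```python
def mackey_glass(n, a=0.2, b=0.1, tau=17, x0=1.2):
    """Generate n samples by Euler integration with unit step
    (a coarse assumption, adequate for illustration)."""
    x = [x0] * (tau + 1)              # constant history before t = 0
    for _ in range(n):
        x_tau = x[-(tau + 1)]         # delayed value x(t - tau)
        dx = a * x_tau / (1.0 + x_tau ** 10) - b * x[-1]
        x.append(x[-1] + dx)
    return x[tau + 1:]

series = mackey_glass(500)            # a bounded, irregular sequence
```

A predictor network is then trained on windows of past samples to output the next sample, which is exactly the short-range prediction task described above.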

 

(d) Modeling and System Identification.

Another problem, akin to (b) and (c), is the identification of mathematical models on the basis of experimental measurements. If the model structure is not clear at the outset, non-parametric identification procedures can play a useful role. Here, instead of identifying the physical parameters of the system, one simply develops a model that fits the observed experimental data. We are currently experimenting with the use of recurrent neural nets as well as nets with dynamic neurons in solving this problem.
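As a toy illustration of fitting a model to observed input/output data, one can identify the single feedback weight of a one-neuron linear recurrent model by gradient descent; the input sequence and the "true" parameter below are invented, and a numerical gradient stands in for back propagation through time.

```python
true_a = 0.5                                   # hypothetical system to recover
u = [1.0, -0.5, 0.3, 0.8, -0.2, 0.6, 0.1, -0.7]   # invented input sequence

def simulate(a, u):
    """Run the one-neuron recurrent model x[t+1] = a * x[t] + u[t]."""
    x, xs = 0.0, []
    for ut in u:
        x = a * x + ut
        xs.append(x)
    return xs

target = simulate(true_a, u)                   # plays the role of measured data

def loss(a):
    return sum((y - t) ** 2 for y, t in zip(simulate(a, u), target))

a, lr, eps = 0.0, 0.02, 1e-6
for _ in range(2000):
    grad = (loss(a + eps) - loss(a - eps)) / (2.0 * eps)  # numerical gradient
    a -= lr * grad                             # gradient descent on the weight
```

The recovered weight reproduces the observed responses without any claim about the system's internal physics, which is the non-parametric point of view.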

 

(e) Troubleshooting and Diagnostics.

In this project we are trying to use neural nets for on-line troubleshooting and diagnosis of data processing equipment. Here a neural net is being used to categorize a troubleshooting problem prior to searching a database of prior cases of a similar nature.
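The retrieval step that follows categorization might look like the following nearest-case lookup; the categories, symptom vectors, and fixes are entirely invented, and dot-product similarity is an assumed stand-in for whatever matching the actual system uses.

```python
# Invented case base: (category, symptom vector, recorded fix).
cases = [
    ("printer", (1, 0, 1, 0), "replace toner cartridge"),
    ("printer", (0, 1, 0, 1), "clear paper jam"),
    ("network", (1, 1, 0, 0), "reset the router"),
]

def best_match(category, symptoms):
    """Within one category, return the fix of the most similar prior
    case, scored by a simple dot product (an assumed measure)."""
    scored = [(sum(s * f for s, f in zip(symptoms, feats)), fix)
              for cat, feats, fix in cases if cat == category]
    return max(scored)[1]

print(best_match("printer", (1, 0, 1, 1)))    # -> replace toner cartridge
```

Categorizing first keeps the search confined to a small, relevant slice of the case database.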

 


vemuri1@llnl.gov
Monday the 11th, December 1995