NEURAL NETWORKS IN ROBOTICS
Today there is a class of important tasks that are impossible or difficult to solve without artificial neural networks (ANNs). These tasks often include classification, prediction and the control of complex systems. The concept of so-called deep learning has recently been gaining popularity. In fact, there is nothing revolutionary in this concept: research on artificial neural networks has been carried out since the middle of the last century. What has changed is that the performance of personal computers and the development of parallel computing architectures now allow a wide circle of researchers to apply these structures far more effectively in machine learning. This, in turn, has stimulated another rise of interest in neural networks. There are reasons to think that this leap will be significant and will shape the development of machine learning technologies in the future.
Initially we did not set out to use neural network machine learning specifically, but in the course of the work we came to understand that without these technologies it is impossible to build comprehensive control systems for a wide class of modern tasks. In robotics, neural networks are used most often in artificial vision and in complex motion tasks where conventional methods become impractical. Neural network control in robotics makes it possible to lift many existing technical constraints, such as a minimum rigidity of structural elements or strictly defined movement trajectories, but above all it removes the need for an accurate formal description of the surrounding space. On the other hand, for advanced artificial vision (image recognition in particular) no better alternative to neural network approaches has been found for processing the information. Above all, this makes control possible in surroundings that are partially uncertain and/or changing dynamically. This holds equally for consumer robotics, industrial devices and other devices. It should be noted in particular that neural network control opens wide possibilities for applying automatic control in areas where a human presence is required.
In this section we will not try to cover the whole range of neural network applications; the main emphasis is on implementing technical vision and motion control. These two application areas are closely connected, because on most robotic platforms the two tasks have to be solved simultaneously. It is noteworthy that these tasks, which look different at first sight, can be solved with conceptually similar approaches. The current level of technology allows neural network solutions to be implemented not only on high-performance platforms but also on microprocessor-based devices in embedded applications.
In general, a neural network is a computer model that functions on the principle of biological neural networks and is implemented either on classical computers with the von Neumann architecture, which execute commands sequentially, or on graphics processors, for example on the CUDA architecture. A large amount of research and development is directed at creating specialized hardware architectures, but for a number of reasons, such as the low price, wide availability and versatility of traditional microprocessors, specialized architectures currently see limited use.
When creating a neural network model, engineers and researchers set themselves different goals, and the model itself, i.e. its precision, its learning principle and other parameters, depends strongly on those goals. For example, neuroscientists aim to study how the various cognitive structures of real biological systems work (the Blue Brain project, for example), while control engineers look for and adapt operating concepts that can be integrated smoothly into control systems. We propose to focus on the practical side of using neural networks, because this helps to narrow the gap between practical engineering and theoretical developments in this field.
Among the known network architectures we can distinguish two main classes: feedforward networks and recurrent networks, i.e. networks with feedback. In feedforward networks the neurons are, from a logical point of view, arranged in layers, and information is passed between the layers in one direction, moving through a certain number of intermediate layers from receptors (sensory cells) to effectors (actuators). The advantage of this type of network is a well-developed set of learning techniques. This class includes the Rosenblatt perceptron, the multilayer perceptron of Rumelhart, the cognitron, the neocognitron and convolutional networks.
Let us consider a neural network model as a three-dimensional model of a set of formal neurons arranged randomly in space. At this stage the number of neurons, their types and their principles of operation do not matter. The only important thing is that each network element has definite coordinates in three-dimensional space, and that this information about location is used for forming, training and operating the network. Another important point is that, given the coordinates of the elements, visualizations can be produced at any time, which is clearly necessary for understanding the processes taking place in the model. Each formal neuron may be connected to many other neurons by notional synaptic connections. These connections transmit signals between neurons, and their number may vary greatly depending on the type of information being processed and the type of neuron.
By analogy with the usual elements of automatic control systems (for example, PID controllers), each neuron is capable of performing a certain logic function and is relatively simple to implement. When these elements are combined into a network, however, the network gains the ability to perform complex control functions, which is precisely what neural networks are used for. Engineers will recognize the obvious analogy in which each network element takes the form of a classic PID controller. The difference is that each such controller has a large number of input ports. This should not undermine the analogy: in a classical controller the error signal is calculated from two signals, whereas here the error is calculated from a large number of signals, each with its own weighting coefficient. The classic controller can thus be viewed as a special case of a formal neuron. Moreover, unlike classical controllers, formal neurons offer a wide variety of methods for calculating the error and a wide variety of activation functions. The activation function of a neuron is the dependence of the neuron's output signal on the weighted sum of its input signals. By analogy with biological systems, there are a number of different activation functions: linear, integrating, threshold, sigmoid and others. The activation function is selected depending on the type of network, the learning algorithm and other factors.
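The formal neuron described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names are our own, the integrating activation is omitted because it needs internal state, and only the proportional term of a controller is shown as a special case of a neuron.

```python
import math

# Three of the activation functions named in the text.
def linear(s):
    return s

def threshold(s):
    return 1.0 if s >= 0.0 else 0.0

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs followed by an activation function."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(s)

def p_controller_as_neuron(setpoint, measured, gain):
    """Proportional controller written as a two-input linear neuron.

    The error (setpoint - measured) is a weighted sum with weights
    +gain and -gain; the function name is hypothetical.
    """
    return neuron_output([setpoint, measured], [gain, -gain], 0.0, linear)
```

Calling neuron_output with more inputs and a different activation gives the many-input generalization discussed in the text, while the two-input linear case reproduces the usual error-times-gain law.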
The research field of artificial neural networks can be considered to begin with the American psychologist Frank Rosenblatt, who in 1957 proposed the perceptron, a model of how the brain perceives information, and went on to build the Mark-1, the world's first neurocomputer. Rosenblatt tried to understand the general laws of information processing in biological systems and to transfer the principles he found to artificial systems.
Rosenblatt's perceptron is a structure consisting of three layers: receptors (sensory elements), hidden neurons (associative elements) and effectors (output, or responding, elements). It is a feedforward network and is described in the main part of Rosenblatt's book (Russian translation of the book is available here). Each associative element is understood to be connected to all sensory elements, and each output element to all associative elements. There are no other connections in this model. Each connection has its own weighting factor. The weights of the connections between the receptors and the associative elements are chosen at random when the network is created, and they stay fixed during learning. The weights of the connections between the associative and output elements, by contrast, are changed during learning by the weight-correction method.
Perceptron learning consists of presenting learning samples to the receptor elements (the network input) one after another and then correcting the weights of the synaptic connections according to the error between the desired and the actual output of the network. Supervised learning requires a set of learning samples together with the desired network response to every element of that set. The required duration of learning depends on many parameters, including the size of the input vector, the rate at which the weight coefficients change, the desired accuracy of the network and so on. The criterion for stopping the learning process is often the current value of the error measured on a test sample, or the gradient of that error.
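The error-correction procedure described above can be sketched as follows for a single threshold unit. This is an illustration under stated assumptions: the function names, the learning rate of 1.0 and the logical AND data set are our own choices, and for brevity training stops when an epoch produces no errors on the learning samples rather than on a separate test sample.

```python
def step(s):
    return 1 if s >= 0 else 0

def predict(weights, bias, x):
    return step(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_perceptron(samples, learning_rate=1.0, max_epochs=100):
    """Error-correction learning for a single threshold unit.

    samples: list of (input_vector, desired_output) pairs, outputs 0 or 1.
    Returns (weights, bias, epochs_used).
    """
    n = len(samples[0][0])
    weights = [0.0] * n
    bias = 0.0
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for x, desired in samples:
            error = desired - predict(weights, bias, x)
            if error != 0:
                errors += 1
                # Move each weight in the direction that reduces the error.
                weights = [w + learning_rate * error * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * error
        if errors == 0:  # every sample classified correctly
            return weights, bias, epoch
    return weights, bias, max_epochs

# Logical AND is linearly separable, so the procedure converges.
AND_SAMPLES = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias, epochs = train_perceptron(AND_SAMPLES)
```

The learning_rate parameter plays the role of the "rate at which the weight coefficients change" mentioned in the text; smaller values give smaller corrections per sample.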
There is a misconception that Rosenblatt's perceptron can solve only linearly separable problems and therefore cannot, for example, implement the simple logical XOR function. This is true only for a single-layer perceptron without associative elements. This chapter refers to the classical Rosenblatt perceptron, which does not have this limitation. In fact the perceptron architecture is suitable for solving any classification problem, but that does not mean that it is always better than other architectures. The difficulties begin at the implementation stage, when specific problems are examined in more detail. It turned out that for many real problems the computational complexity is impractically high and negates the advantages of the approach. It was also found that the perceptron architecture tends simply to memorize the input images without generalizing them, i.e. without extracting the invariant features of the images to be recognized. In theory it can solve almost any problem, but only given an infinitely large learning set and infinitely large computing power. Unfortunately no such resources exist in practice, so the use of the perceptron in its pure form is limited to relatively simple classification tasks.
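To illustrate why the associative layer removes the XOR limitation, here is a small sketch of a perceptron with a fixed associative layer and a single trainable output unit. One assumption is stated loudly: in Rosenblatt's model the sensor-to-associative weights are random, whereas here they are hand-picked so that the example is reproducible. All names are our own.

```python
def step(s):
    return 1 if s >= 0 else 0

# Fixed sensor-to-associative connections as (weights, bias) pairs.
# ASSUMPTION: hand-picked instead of random, for reproducibility.
A_UNITS = [([1.0, 1.0], -0.5),   # fires when at least one input is 1
           ([1.0, 1.0], -1.5)]   # fires only when both inputs are 1

def associative_layer(x):
    return [step(sum(w * xi for w, xi in zip(ws, x)) + b) for ws, b in A_UNITS]

def classify(weights, bias, x):
    a = associative_layer(x)
    return step(sum(w * ai for w, ai in zip(weights, a)) + bias)

def train_output_unit(samples, max_epochs=100):
    """Error correction applied only to the associative-to-output weights."""
    weights = [0.0] * len(A_UNITS)
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, desired in samples:
            error = desired - classify(weights, bias, x)
            if error:
                errors += 1
                a = associative_layer(x)
                weights = [w + error * ai for w, ai in zip(weights, a)]
                bias += error
        if errors == 0:
            break
    return weights, bias

XOR_SAMPLES = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
weights, bias = train_output_unit(XOR_SAMPLES)
```

The associative layer maps the four input points to a representation in which XOR becomes linearly separable, so the single trainable layer is enough. With random associative weights the same effect typically requires many more associative elements, which is one source of the computational cost discussed above.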
The logical development of Rosenblatt's perceptron was the multilayer perceptron of Rumelhart, whose distinguishing feature is the presence of more than one learning layer. In Rosenblatt's perceptron, discussed earlier, there was only one learning layer, while the weights of the remaining synaptic connections were fixed and did not change during learning. The number of learning layers typically ranges from 2 to 4. Another feature of Rumelhart's multilayer perceptron is the use of the back-propagation learning algorithm. The idea of back-propagation is to propagate the error signal sequentially from the output of the network to its input, changing the weights in proportion to the gradient of the error as it goes. For this reason the activation function of the neurons must be differentiable, unlike the threshold function, because the learning process needs the derivative of the activation function. In most cases the sigmoid function or the hyperbolic tangent is used, since their derivatives are computed without difficulty. A detailed description of the back-propagation algorithm can be found here (Russian description is available here). This learning algorithm does not protect against getting stuck in a local minimum. Moreover, the classical implementation of back-propagation suffers from so-called network paralysis, in which the learning process almost stops.
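As a rough sketch of the back-propagation idea, the code below computes the error gradient for a tiny network with two inputs, two sigmoid hidden neurons and one sigmoid output, and applies one gradient step. The network size, the squared-error loss, the initial weights and the learning rate are all illustrative assumptions, not values taken from the text.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def forward(params, x):
    """Forward pass; returns the hidden activations and the output."""
    w_h, b_h, w_o, b_o = params
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w_h, b_h)]
    y = sigmoid(sum(w * hi for w, hi in zip(w_o, h)) + b_o)
    return h, y

def loss(params, x, target):
    _, y = forward(params, x)
    return 0.5 * (y - target) ** 2

def backprop(params, x, target):
    """Gradients of the squared error, propagated from output to input."""
    w_h, b_h, w_o, b_o = params
    h, y = forward(params, x)
    # Output delta: dE/dy times the sigmoid derivative y * (1 - y).
    delta_o = (y - target) * y * (1.0 - y)
    grad_w_o = [delta_o * hi for hi in h]
    # Hidden deltas: the output delta sent back through the output weights.
    delta_h = [delta_o * w * hi * (1.0 - hi) for w, hi in zip(w_o, h)]
    grad_w_h = [[d * xi for xi in x] for d in delta_h]
    return grad_w_h, delta_h, grad_w_o, delta_o

def gradient_step(params, grads, learning_rate):
    """Change every weight in proportion to its error gradient."""
    w_h, b_h, w_o, b_o = params
    g_w_h, g_b_h, g_w_o, g_b_o = grads
    return ([[w - learning_rate * g for w, g in zip(row, g_row)]
             for row, g_row in zip(w_h, g_w_h)],
            [b - learning_rate * g for b, g in zip(b_h, g_b_h)],
            [w - learning_rate * g for w, g in zip(w_o, g_w_o)],
            b_o - learning_rate * g_b_o)

# Illustrative starting point: (hidden weights, hidden biases,
# output weights, output bias).
params = ([[0.5, -0.4], [0.3, 0.8]], [0.1, -0.2], [0.7, -0.6], 0.05)
x, target = [1.0, 0.0], 1.0
grads = backprop(params, x, target)
updated = gradient_step(params, grads, learning_rate=0.5)
```

The factor y * (1 - y) in each delta is the sigmoid derivative. It shrinks toward zero when a neuron saturates near 0 or 1, which is the mechanism behind the network paralysis mentioned above.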
Rumelhart's multilayer perceptron has the same inherent disadvantages as Rosenblatt's perceptron, but the multilayer architecture makes it possible to reduce the computational complexity of the network by reducing the required number of synaptic connections.
Cognitron and neocognitron
Studying the region of the cerebral cortex responsible for processing visual information led Kunihiko Fukushima to propose a neural network model called the cognitron. It was the first neural network model to use the principle of local feature detection: the sensory elements at the network input were treated not as an abstract array of real numbers but as a two-dimensional matrix in which the mutual arrangement of the sensory elements affects the network. In other words, the operation of the model explicitly uses information about the spatial arrangement of the input elements. This allowed the first layer to detect simple local image features such as edges and lines, and the following layers to process the information received in a more abstract way. The cognitron is a hierarchical structure in which each subsequent layer performs more general processing and its receptive field covers an ever larger portion of the receptor layer. It was found that organizing the network in this way makes it more invariant to various distortions of the input image, which is an advantage over the perceptron. The neocognitron consists of cascaded neurons of two types. The first is the simple type (S-type), which is responsible for extracting local features. The second is the complex type (C-type), which introduces invariance to distortions of the processed visual signal.
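The interplay of S-type and C-type neurons can be illustrated with a deliberately simplified sketch. It is not Fukushima's model: the S-cells here are plain template matchers with a hard threshold, and the C-cells take the maximum over a small block of S-cells, which is a modern simplification standing in for the original C-cell response. The image size, template and function names are assumptions made for the example.

```python
def s_layer(image, template, threshold):
    """S-type cells: each looks at a small local patch and fires on a match."""
    k = len(template)
    rows, cols = len(image) - k + 1, len(image[0]) - k + 1
    return [[1 if sum(image[r + i][c + j] * template[i][j]
                      for i in range(k) for j in range(k)) >= threshold else 0
             for c in range(cols)] for r in range(rows)]

def c_layer(s_map, size):
    """C-type cells: each pools a size x size block of S-cells (max used here)."""
    return [[max(s_map[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(s_map[0]) - size + 1, size)]
            for r in range(0, len(s_map) - size + 1, size)]

def with_blob(n, top, left):
    """n x n image with a 2 x 2 block of ones whose corner is at (top, left)."""
    image = [[0] * n for _ in range(n)]
    for i in range(2):
        for j in range(2):
            image[top + i][left + j] = 1
    return image

TEMPLATE = [[1, 1], [1, 1]]
original = with_blob(5, 0, 0)
shifted = with_blob(5, 1, 1)          # same feature, moved by one pixel
s_original = s_layer(original, TEMPLATE, threshold=4)
s_shifted = s_layer(shifted, TEMPLATE, threshold=4)
c_original = c_layer(s_original, size=2)
c_shifted = c_layer(s_shifted, size=2)
```

The two S-maps differ because the feature moved, but the two C-maps are identical: the pooling step absorbs the small shift, which is the kind of invariance the C-type neurons are meant to provide.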
The main difference between the cognitron and the networks considered above is the principle of so-called local perception, in which every neuron of the complex type is connected not to all neurons of the previous layer but only to a small group of them located close together. A detailed description of how the neocognitron works is available here.
The next development of neural network architectures, aimed mainly at technical vision, was the convolutional network. This type of network was proposed by Yann LeCun, a French specialist in machine learning and technical vision (personal site of Yann LeCun is available via the link). Like the neocognitron, convolutional networks use successive layers of S-type and C-type neurons, and at the output of the network a fully connected network similar to a multilayer perceptron is additionally used.
In contrast to the neocognitron, the standard error back-propagation method is used for learning. Another distinguishing characteristic is the principle of shared weights: S-type neurons share their weighting factors with the other neurons of the same layer. This significantly reduces the number of free parameters of the network and thereby its memory requirements. With shared weights, computing the values of the S-type neurons becomes similar to the classical convolution of two-dimensional matrices. This similarity is the reason for the name of this type of neural network.
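The effect of shared weights can be sketched as follows. The first function slides one small kernel over the whole image, so every output neuron uses the same weights; the second compares the number of free weights with and without sharing. The 28 x 28 image and 5 x 5 kernel sizes, the edge kernel and the function names are illustrative assumptions.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image ("valid" positions only).

    As in most neural network code the kernel is not flipped, so strictly
    speaking this is cross-correlation rather than convolution.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)] for r in range(out_h)]

def parameter_counts(image_h, image_w, kernel_h, kernel_w):
    """Free weights for one feature map: shared kernel vs. unshared local weights."""
    outputs = (image_h - kernel_h + 1) * (image_w - kernel_w + 1)
    shared = kernel_h * kernel_w
    unshared = outputs * kernel_h * kernel_w
    return shared, unshared

IMAGE = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
VERTICAL_EDGE = [[-1, 1],
                 [-1, 1]]
feature_map = conv2d_valid(IMAGE, VERTICAL_EDGE)
# feature_map is [[0, 2, 0], [0, 2, 0], [0, 2, 0]]: the edge column stands out.

shared, unshared = parameter_counts(28, 28, 5, 5)
# shared is 25 weights, unshared is 14400 weights for the same feature map.
```

The same 25 weights serve all 576 output positions in this example, which is where the reduction in memory requirements comes from.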
A detailed description of the use of convolutional neural networks for recognizing handwriting images can be found here. Convolutional networks are being actively researched, and today their use is no longer restricted to technical vision: they are beginning to be applied in audio processing and in control.
Let us consider a group of neurons of the same type located in space in a certain way. Such groups are characterized by the fact that describing them does not require a large amount of information: it is enough to specify the type of neuron used, the method and density with which the neurons fill the space, and the coordinates of the filled region. This means, however, that all properties of a neuron that determine its behavior during formation, learning and operation are set by the neuron type, which remains unchanged in the case considered. This method makes it easy to manipulate the characteristics of the group.
A fully functional neural network can consist of one group or of many, depending on the task. Thus, to encode a neural network, it is necessary to provide information about the type and location of the groups, to describe the types and characteristics of the neurons in the groups, and to provide information on the number and location of receptors and effectors. Receptors and effectors are specialized types of neurons intended for input and output respectively. A conceptual aspect of this coding is that the groups of neurons in a network may, and should, overlap in space, and one group of neurons can freely communicate with other groups. This type of coding economizes on the information needed to describe a neural network and, as a result, makes the network structure well suited to automated optimization.
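One possible way to express this group-based encoding as a data structure is sketched below. Everything here is hypothetical: the class names, the fields of a neuron type, the box-shaped regions and the uniform random filling are our assumptions, chosen only to show how little information is needed to describe a group.

```python
from dataclasses import dataclass
import random

@dataclass(frozen=True)
class NeuronType:
    name: str
    activation: str       # illustrative field, e.g. "sigmoid"
    max_synapses: int     # illustrative field

@dataclass(frozen=True)
class Box:
    """Axis-aligned region of 3D space given by two opposite corners."""
    lo: tuple
    hi: tuple

    def volume(self):
        v = 1.0
        for a, b in zip(self.lo, self.hi):
            v *= b - a
        return v

    def overlaps(self, other):
        return all(a_lo < b_hi and b_lo < a_hi
                   for a_lo, a_hi, b_lo, b_hi
                   in zip(self.lo, self.hi, other.lo, other.hi))

@dataclass(frozen=True)
class NeuronGroup:
    neuron_type: NeuronType
    region: Box
    density: float        # neurons per unit volume

    def neuron_count(self):
        return round(self.density * self.region.volume())

    def place_neurons(self, seed=0):
        """Expand the compact description into explicit 3D coordinates."""
        rng = random.Random(seed)
        return [tuple(rng.uniform(a, b)
                      for a, b in zip(self.region.lo, self.region.hi))
                for _ in range(self.neuron_count())]

hidden = NeuronGroup(NeuronType("hidden", "sigmoid", 16),
                     Box((0, 0, 0), (2, 2, 2)), density=5.0)
receptors = NeuronGroup(NeuronType("receptor", "linear", 4),
                        Box((1, 1, 1), (3, 3, 2)), density=2.0)
network = [hidden, receptors]
total_neurons = sum(group.neuron_count() for group in network)
coordinates = hidden.place_neurons(seed=1)
```

Each group is described by a handful of numbers regardless of how many neurons it contains, and the two example groups overlap in space, as the text requires. An optimizer could then vary the density, region or neuron type of a group instead of manipulating individual neurons.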
Fig.2. Convolutional networks.
Fig.3. Spiking networks.