Traditionally, vision systems are open-loop chains of sequential operations that run with constant, predefined parameters and have no interconnections between them. This approach affects the final 3D reconstruction result, since each operation in the chain is applied in turn, with no information exchanged between the different levels of processing. In other words, low-level image processing is performed regardless of the requirements of high-level processing. In such a system, for example, if the segmentation module fails to provide a good output, all the subsequent steps will fail.
The basic diagram from which feedback mechanisms for machine vision are derived can be seen in the above figure. In such a control system, the control signal u, or actuator variable, is a parameter of an image processing operation, whereas the controlled, or state, variable y is a measure of processing quality.
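To make the roles of u and y concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the image processing operation is a binary threshold, so that u is the threshold value, and that y is the between-class variance of the resulting segmentation. The function names `segment` and `quality`, the choice of quality measure and the toy image are all hypothetical.

```python
def segment(image, u):
    """Low-level operation: binary threshold with parameter u (the actuator variable)."""
    return [[1 if pixel > u else 0 for pixel in row] for row in image]


def quality(image, mask):
    """Controlled variable y: between-class variance of the segmentation (assumed measure)."""
    fg = [p for row, mrow in zip(image, mask) for p, m in zip(row, mrow) if m == 1]
    bg = [p for row, mrow in zip(image, mask) for p, m in zip(row, mrow) if m == 0]
    if not fg or not bg:
        return 0.0  # every pixel landed in one class: no separation at all
    n = len(fg) + len(bg)
    mean_fg = sum(fg) / len(fg)
    mean_bg = sum(bg) / len(bg)
    return (len(fg) / n) * (len(bg) / n) * (mean_fg - mean_bg) ** 2


# Toy 4x4 grayscale image: a bright object (about 200) on a dark background (about 30).
image = [
    [30, 32, 28, 31],
    [29, 200, 205, 30],
    [31, 198, 202, 29],
    [30, 31, 29, 32],
]

# Evaluate y for a too-low, a reasonable and a too-high threshold.
for u in (10, 100, 201):
    y = quality(image, segment(image, u))
    print(f"u={u:3d} -> y={y:.1f}")
```

In this toy setting the quality measure is zero when the threshold is too low to separate anything, and it is largest for a threshold that lies between the object and background intensities. A feedback loop would use such a measurement of y to adjust u.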
In a robotic application, the purpose of the image processing system is to understand the surrounding environment of the robot through visual information. Usually, an object recognition and 3D reconstruction chain for robot vision consists of low-level and high-level processing operations. Low-level image processing deals with pixel-wise operations that aim to improve the input images and to separate objects of interest from the background. Both the inputs and the outputs of the low-level processing blocks are images. The second type of modules, which deal with high-level visual information, are connected to the low-level operations through a feature extraction component which converts the input images to abstract data describing the imaged objects. The importance of the quality of the results coming from the low-level stages is related to the requirements of high-level image processing. Namely, in order to obtain a proper 3D virtual reconstruction of the imaged environment at the high-level stage, the inputs coming from the low-level stages have to be reliable.
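The chain described above can be sketched as three small functions. This is an illustrative sketch under stated assumptions, not the processing chain of the cited work: the low-level stage is assumed to be a fixed threshold, the features are assumed to be the area and centroid of the segmented region, and the high-level stage is reduced to a plausibility check on the area. All names (`low_level`, `extract_features`, `high_level`) and the expected area are hypothetical.

```python
def low_level(image, threshold):
    """Image in, image out: mark object pixels as 1 and background pixels as 0."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


def extract_features(mask):
    """Image in, abstract data out: area and centroid of the segmented region."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None  # nothing was segmented, so there is nothing to describe
    area = len(coords)
    centroid = (sum(r for r, _ in coords) / area, sum(c for _, c in coords) / area)
    return {"area": area, "centroid": centroid}


def high_level(features, expected_area=4, tolerance=1):
    """Abstract data in, decision out: accept the object only if its area is plausible."""
    if features is None:
        return False
    return abs(features["area"] - expected_area) <= tolerance


# Toy image: a 2x2 bright object on a dark background.
image = [
    [30, 32, 28, 31],
    [29, 200, 205, 30],
    [31, 198, 202, 29],
    [30, 31, 29, 32],
]

features = extract_features(low_level(image, 100))
recognized_good = high_level(features)
# With a badly chosen fixed threshold the whole image is segmented as "object",
# and the high-level stage can no longer recognize it.
recognized_bad = high_level(extract_features(low_level(image, 10)))
print(features, recognized_good, recognized_bad)
```

The second call shows the open-loop weakness in miniature: the high-level stage has no way to ask the low-level stage for a better threshold, so an unreliable segmentation propagates to the final decision.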
In order to derive a control strategy for a machine vision system, the following discrete nonlinear state-space model of the vision apparatus is suggested:

$$x(k+1) = f(x(k), u(k)), \qquad y(k) = g(x(k)),$$

where $x$ is the state vector, $u$ is the actuator (input), $y$ is the output vector, $f$ is the state transition function and $g$ is the output function; $k$ represents the discrete time. Suppose that we have a control law

$$u(k) = \alpha(x(k), \theta),$$

parameterized by $\theta$. The control problem is to find the optimal parameter $\theta^*$ which provides an output of desired, or reference, quality. Following the above reasoning, the closed-loop system

$$x(k+1) = f(x(k), \alpha(x(k), \theta))$$

has its equilibrium point parameterized by $\theta$. Having in mind the high non-linearity of an image processing system, a control strategy based on extremum seeking is suggested. Thus, the goal of the feedback control system is to determine the optimal parameter $\theta^*$ as the minimum, or maximum, value of the state vector $x(\theta)$:

$$\theta^* = \arg\min_{\theta} x(\theta) \quad \text{or} \quad \theta^* = \arg\max_{\theta} x(\theta).$$
This particular type of control method is chosen because the non-linearity of an image processing system makes it difficult to determine reference values that could be applied to classical feedback structures. Hence, in the image processing control approach, the desired state of a vision system is given by the extremal values of the state vector. The proposed model is applied to the depth estimation process.
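The idea of searching for an extremum instead of tracking a reference value can be illustrated with a short sketch. It is not the controller of the cited work: a simple hill-climbing search with step halving stands in for an extremum-seeking scheme, and the quality curve is a synthetic function with a single maximum. The names `extremum_seek` and `synthetic_quality`, the parameter range 0 to 255 and the peak at 120 are assumptions made only for this illustration.

```python
def extremum_seek(quality_of, u0, step=16.0, min_step=0.5, u_min=0.0, u_max=255.0):
    """Search for the parameter u that maximizes quality_of(u), without a reference value."""
    u, y = u0, quality_of(u0)
    while step >= min_step:
        improved = False
        # Try one step up and one step down, clamped to the valid parameter range.
        for candidate in (u + step, u - step):
            candidate = min(u_max, max(u_min, candidate))
            y_candidate = quality_of(candidate)
            if y_candidate > y:
                u, y = candidate, y_candidate
                improved = True
                break
        if not improved:
            step /= 2.0  # no better neighbour at this step size: refine the search
    return u, y


def synthetic_quality(u):
    """Hypothetical quality curve with a single maximum at u = 120."""
    return -(u - 120.0) ** 2


u_opt, y_opt = extremum_seek(synthetic_quality, u0=40.0)
print(f"parameter found: {u_opt}, quality: {y_opt}")
```

No reference value for the quality appears anywhere in the loop; the search only compares measurements, which is the property that motivates an extremum-based approach here. On a real image the quality curve may have plateaus or several local extrema, and a plain hill climb like this one can stop at a local extremum.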
S.M. Grigorescu, G. Macesanu, T.T. Cocias, D. Puiu and F. Moldoveanu, "Robust Camera Pose and Scene Structure Analysis for Service Robotics", Robotics and Autonomous Systems, Elsevier, Netherlands, vol. 59, no. 11, pp. 889-909, 2011.
S.M. Grigorescu, "Robust Machine Vision for Service Robotics", Institute of Automation, University Bremen, Shaker-Verlag, Aachen, 2010.