3D computer vision, photogrammetry: Current research activities
Since April 2010, I have been professor of image analysis at the Faculty of Electrical Engineering and Information Technology of Dortmund University of Technology, where my main research fields are 3D computer vision, photogrammetry, and pattern recognition.
Lectures, Bachelor/Master/Diplom projects
From January 2000 to March 2010, I worked as a senior research scientist in
the Machine Perception group at Daimler Group
Research in Ulm, Germany. I was responsible for projects applying
3D computer vision, photogrammetric methods, and pattern recognition
approaches to the fields of industrial production (e. g.
quality inspection and human-robot interaction) and driver assistance systems.
From 2005 to 2010, I was a visiting lecturer at the
Faculty of Technology of
Bielefeld University.
Recognition of working actions based on a non-stationary
HMM framework. Top: Spatio-temporal 3D pose estimation of the hand-forearm
limb with the Shape Flow algorithm. Bottom: Results of action recognition.
Blue: transfer motion; red: screw_1; black: screw_2; green: clean; ochre:
plug; white: unknown action; GT: ground truth (from Hahn et al., HCRS 2009).
Long-term trajectory prediction of vehicles based on
multiple hypotheses (from Hermes et al., Oldenburger 3D-Tage
2009, IV 2009).
Recognition of working actions and long-term motion
prediction by classification of trajectories for human-robot
interaction (from Hahn et al., ICVS 2008).
3D reconstruction of a raw forged iron surface based on a
combined analysis of stereo, intensity, and polarisation features (top)
and relying on monocular photopolarimetric features alone (bottom, a pixel
corresponds to 0.30 mm) (from Wöhler and d'Angelo (2009),
International Journal of Computer Vision 81;
d'Angelo and Wöhler
(2008), ISPRS Journal of Photogrammetry and Remote Sensing 63; cf. also
d'Angelo and Wöhler,
PCV 2006, DAGM 2005).
Segmentation and 3D pose estimation of vehicles based
on stereo image analysis and optical flow estimation (from Barrois et al., IV 2009).
Spatio-temporal 3D pose estimation of the hand-forearm
limb for human-robot interaction with the Shape Flow algorithm (from Hahn et al., ICPR 2008).
Spatio-temporal 3D pose estimation of the hand-forearm limb
for human-robot interaction (from Barrois and Wöhler,
ICVS 2008).
3D tracking of the hand-forearm limb and the head-shoulder
contour for human-robot interaction (from Hahn et al., 3DIM
2007, Oldenburger 3D-Tage 2009).
Tracking of moving objects in the workspace of an industrial
robot (from Schmidt et al., ICVS
2007).
Real-world (left) and synthetically generated (right)
training samples for traffic sign recognition
(from Hoessler et al., ICVS
2007).
3D reconstruction at absolute scale by combination
of structure from motion and depth from defocus (from
Kuhl et al., DAGM 2006).
3D pose estimation of industrial parts (from von Bank et al., DAGM 2003).
3D reconstruction of a glue line on a non-planar surface
(from d'Angelo and Wöhler,
ICCVG 2004).
A related research interest of mine is the image-based 3D reconstruction of lunar surface regions.
Left: DEM of the lunar volcanic dome Cauchy ω
(from Wöhler et al.
(2006), Icarus 183). Right: DEM of the northern half of the lunar
crater Kepler, obtained based on a combined structure from motion and
shape from shading analysis of a sequence of the Smart-1 AMIE camera
(from d'Angelo and Wöhler
(2008), ISPRS Journal of Photogrammetry and Remote Sensing 63).
For detailed information please refer to the list of publications.
Download of image sequences and ground truth data
A result of technology transfer: SafetyEYE
A more application-oriented research project for which my colleague Dr. Lars Krüger and I are responsible has led to the development of the vision-based SafetyEYE system for three-dimensional surveillance of working areas in industrial production. This system was created in a cooperation between Mercedes-Benz production in Sindelfingen, DaimlerChrysler Group Research in Ulm, and the company Pilz GmbH & Co. KG, a specialist in safe automation.
SafetyEYE's trinocular camera sensor
General information about the functionality of SafetyEYE (cf. published press material):
The SafetyEYE system consists of three calibrated cameras which monitor
the protection area around a machine, e. g. an industrial robot, and two
high-performance industrial PCs. The implemented stereoscopic algorithms
determine the three-dimensional structure of the scene being surveyed.
As soon as a potentially hazardous situation is about to occur, the system
initiates the protective measures necessary to prevent an accident, either
by slowing down or by stopping the machine. An important advantage of the
SafetyEYE system is the fact that it can be installed quickly and efficiently.
While setting up a traditional safety system consisting of several components
such as metal fences, light barriers, and laser scanners may take as long as
one day, only a few hours are needed to configure SafetyEYE's three-dimensional
protection areas. In the future, the system is intended to be extended to
distinguish between persons and objects. This will be a step towards
collaborative working environments in which persons and machines are able
to work simultaneously on the same workpiece.
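The zone logic described above (three-dimensional protection areas, with the machine either slowed down or stopped) can be illustrated with a minimal sketch. It is not the SafetyEYE implementation: the axis-aligned box shape, the two-zone layout, and the names `Box` and `required_action` are all assumptions made for illustration, and the 3D points are taken as already reconstructed by some stereo method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned protection area in machine coordinates."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    z_min: float
    z_max: float

    def contains(self, point):
        x, y, z = point
        return (self.x_min <= x <= self.x_max
                and self.y_min <= y <= self.y_max
                and self.z_min <= z <= self.z_max)


def required_action(points, warning_zone, stop_zone):
    """Return "stop", "slow", or "run" for a set of reconstructed 3D points.

    Assumption: any point inside the stop zone halts the machine, and any
    point inside the warning zone slows it down.
    """
    action = "run"
    for point in points:
        if stop_zone.contains(point):
            return "stop"          # most severe case, no need to look further
        if warning_zone.contains(point):
            action = "slow"
    return action
```

For example, with a stop zone of one metre and a warning zone of two metres around the origin, a point at (1.5, 0, 0) would yield "slow" and a point at (0.5, 0, 0) would yield "stop".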
SafetyEYE has received the Automation Award 2006 as the most outstanding product presented at the SPS Drives technology fair in Nürnberg 2006, along with further awards (GIT Sicherheit Award 2007, ISA Award 2007, Electrical Industry Awards 2007, ETOP Innovation Award). It is among the five products nominated for the Hermes Award 2007 and has been nominated for the Deutscher Arbeitsschutzpreis.
Further details about the SafetyEYE system are given in the DaimlerChrysler Hightech Report 2/2006 (English version of the article on eMercedesBenz). For further information, see the Mercedes-Benz press release, the SafetyEYE product website featuring an illustrative video, and the website of the system supplier Pilz GmbH & Co. KG.
Shape from shading methods for industrial quality inspection
The application of shape from shading methods (similar to those used for generating planetary digital elevation maps) to three-dimensional surface reconstruction in the domain of industrial quality inspection is described in this article by Dr. Georg Wiora (Nanofocus AG) and me: "3D-Vision aus dem All für die Industrie. Die Eleganz der Shape-from-Shading-Methode." Inspect 01/2008, pp. 78-79. Download in PDF format and as e-paper.
Further activities
Media
Neural networks for vision-based driver assistance systems: Dissertation (PhD thesis)
From 1997 to 1999, I worked as a PhD student in the Machine Perception group at DaimlerChrysler Research Centre Ulm, Germany. My advisors were Prof. Dr. Joachim K. Anlauf (Institut für Technische Informatik, University of Bonn) and Prof. Dr. Jürgen Schürmann (Technische Hochschule Darmstadt, DaimlerChrysler Research and Technology).
Abstract
My thesis is about a neural network algorithm for image sequence analysis based on a time delay neural network (TDNN) architecture with spatio-temporal receptive fields and adaptable time delay parameters. The aim is to classify objects in temporal sequences of greyscale images and to estimate their motion behaviour.
The "traditional" TDNN concept for processing discrete-time input signals, characterized by the use of temporal receptive fields, is well known, e. g. from applications in the field of speech recognition. In these TDNNs, however, the time delay parameters are restricted to integer values, and the temporal length of the receptive fields has to be determined by rather tedious manual adaptation. The adaptable time delay neural network (ATDNN) architecture for image sequence analysis proposed in this thesis differs from the traditional TDNN approach in two respects: its receptive fields are spatio-temporal, and its time delay parameters are real-valued and, together with the temporal receptive field extensions, adapted during the training process instead of being imposed a priori. Hence, the ATDNN algorithm allows important properties of the network architecture to be determined by learning from examples rather than by manual adaptation.
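To illustrate why a real-valued time delay can be adapted by training while an integer one cannot, here is a minimal sketch of a temporal receptive field whose taps are read at non-integer positions. Linear interpolation between neighbouring samples is an assumption chosen for simplicity; it is not necessarily the parameterization used in the thesis, and the function names are hypothetical.

```python
import math


def sample_at_delay(signal, t, delay):
    """Read the signal at the real-valued time t - delay.

    Assumption: values between two discrete samples are obtained by
    linear interpolation, and positions outside the signal are clamped.
    """
    pos = min(max(t - delay, 0.0), len(signal) - 1.0)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(signal) - 1)
    frac = pos - lo
    return (1.0 - frac) * signal[lo] + frac * signal[hi]


def temporal_receptive_field(signal, t, delay, weights):
    """Weighted sum over len(weights) taps spaced by a real-valued delay."""
    return sum(w * sample_at_delay(signal, t, k * delay)
               for k, w in enumerate(weights))
```

Because the output varies continuously with `delay`, a small change of the delay produces a small change of the output, which is what makes it possible to treat the delay as a trainable parameter alongside the weights.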
As a first toy example, it is shown that on synthetic image sequences displaying elliptic spots of different orientation moving horizontally across the scene at several speeds, the ATDNN manages both to classify the shapes correctly and to estimate their speed and motion direction. Notably, a network that has learned a certain number of shape and motion classes is able to generalize to intermediate shapes and speeds it has never "seen" during training.
Within the framework of the Urban Traffic Assistant (UTA) project carried out by the Image Understanding Department of DaimlerChrysler Research and Technology, the ATDNN is furthermore applied to real-time recognition of overtaking vehicles on motorways as well as of pedestrians in the inner city environment. In this context the recognition task is regarded as a classification problem. The ATDNN turns out to yield a stable and robust recognition behaviour, even under difficult visibility conditions such as rainy or foggy weather. Furthermore, a comparison of the spatio-temporally "local" ATDNN structure to state-of-the-art "global" polynomial support vector machine (SVM) classifiers concerning computational complexity and memory demand, values of high relevance for hardware implementations, shows that the ATDNN is superior in both respects by up to several orders of magnitude.
My PhD thesis has been published as follows:
C. Wöhler.
Neuronale Zeitverzögerungsnetzwerke für die Bildsequenzanalyse
und ihre Anwendung in fahrzeuggebundenen Bildverarbeitungssystemen.
Dissertationsschrift, Mathematisch-Naturwissenschaftliche Fakultät
der Rheinischen Friedrich-Wilhelms-Universität Bonn, 2000.
VDI-Fortschritt-Berichte, Reihe 10, Nr. 645, VDI-Verlag, Düsseldorf,
2000.
A summary can be found in:
C. Wöhler, J. K. Anlauf.
Real-time object recognition on image sequences
with the adaptable time delay neural network algorithm -
applications for autonomous vehicles.
Image and Vision Computing, vol. 19, no. 9-10, pp. 593-618, 2001.
Result videos and images concerning pedestrian recognition in the urban environment with the ATDNN:
Short video sequence (acquired at 25 Hz video rate, only every 4th image is shown) displaying a pedestrian crossing the street. The green box denotes the region of interest (ROI) extracted by stereo image analysis, the lower half of which - containing the pedestrian's legs - is used for classifying the detected object. The sequence of small images at the top shows the lower half of the ROI in the current and the seven preceding images of the sequence, respectively; this image sequence is used as an input to the ATDNN classifier. The triangular icon marks an object that is classified as a pedestrian by the ATDNN.
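The classifier input described above, the lower half of the ROI in the current and the seven preceding images, can be sketched as a small rolling buffer. This is only an illustration: the class name `RoiHistory`, the `(top, left, height, width)` ROI convention, and images given as lists of pixel rows are assumptions, and any rescaling of the crops to a common size is left out.

```python
from collections import deque


class RoiHistory:
    """Keeps the lower ROI half of the most recent frames (hypothetical helper)."""

    def __init__(self, length=8):
        # A deque with maxlen drops the oldest crop automatically.
        self.frames = deque(maxlen=length)

    def push(self, image, roi):
        """Crop the lower half of the ROI from one frame and store it.

        image: list of pixel rows; roi: (top, left, height, width).
        """
        top, left, height, width = roi
        half = top + height // 2
        crop = [row[left:left + width] for row in image[half:top + height]]
        self.frames.append(crop)

    def ready(self):
        """True once the current and all preceding crops are available."""
        return len(self.frames) == self.frames.maxlen

    def stack(self):
        """Return the stored crops, oldest first, as the classifier input."""
        return list(self.frames)
```

A caller would push one crop per frame and pass `stack()` to the classifier only once `ready()` returns true.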
For further results, see list of publications.
Theory of neural networks: Diplom thesis
My Diplom thesis is about the dynamics of online learning in two-layered neural networks with continuous units. The learning process in such networks is dominated by plateaus in the time dependence of the generalization error. Tools from statistical mechanics are used to describe the training process in terms of a system of coupled non-linear differential equations. For a soft committee machine, the dynamics of learning are shown to have several fixed points, which give rise to complicated behaviour such as cascade-like runs through different plateaus with decreasing values of the corresponding generalization error. Learning-rate-dependent phenomena are examined, such as the splitting and disappearing of fixed points of the equations of motion. The dependence of plateau lengths on the initial conditions is described analytically, and the results are confirmed by simulations.
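The setting can be reproduced in a small simulation. The sketch below trains a student soft committee machine online on examples labelled by a teacher of the same architecture and records an estimate of the generalization error. Several details are assumptions rather than taken from the thesis: the activation g(x) = erf(x/√2), the orthonormal teacher vectors, the small random student initialization, the learning rate scaled by 1/N, and all function names.

```python
import math
import random


def g(x):
    """Hidden unit activation (assumed: error function)."""
    return math.erf(x / math.sqrt(2.0))


def g_prime(x):
    """Derivative of g."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * x * x)


def output(weights, xi):
    """Soft committee machine: sum of hidden unit activations."""
    fields = [sum(w * x for w, x in zip(row, xi)) for row in weights]
    return sum(g(h) for h in fields), fields


def estimate_error(student, teacher, n, rng, samples=200):
    """Monte Carlo estimate of the generalization error 0.5 * <(sigma - tau)^2>."""
    total = 0.0
    for _ in range(samples):
        xi = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sigma, _ = output(student, xi)
        tau, _ = output(teacher, xi)
        total += 0.5 * (sigma - tau) ** 2
    return total / samples


def train(n=20, k=2, eta=1.0, steps=4000, record_every=200, seed=0):
    """Online gradient descent; returns a list of (step, error estimate)."""
    rng = random.Random(seed)
    # Assumed teacher: orthonormal weight vectors along the first k axes.
    teacher = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(k)]
    # Nearly identical small student vectors start close to a symmetric state.
    student = [[rng.gauss(0.0, 1e-3) for _ in range(n)] for _ in range(k)]
    history = [(0, estimate_error(student, teacher, n, random.Random(seed + 1)))]
    for step in range(1, steps + 1):
        xi = [rng.gauss(0.0, 1.0) for _ in range(n)]   # one fresh example per step
        tau, _ = output(teacher, xi)
        sigma, fields = output(student, xi)
        for row, h in zip(student, fields):
            delta = (tau - sigma) * g_prime(h)          # gradient of the squared error
            for j in range(n):
                row[j] += eta / n * delta * xi[j]
        if step % record_every == 0:
            history.append(
                (step, estimate_error(student, teacher, n, random.Random(seed + 1))))
    return history
```

Plotting the recorded error against the step count for different seeds, learning rates, or initialization scales is one way to look for the plateaus discussed above; how pronounced they are depends on these choices.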
Essential results can be found in the following publication:
M. Biehl, P. Riegler, C. Wöhler.
Transient dynamics of on-line learning
in two-layered neural networks.
J. Phys. A 29, p. 4769, 1996.