Projects
 Home  People  Projects   Publications  Seminars  Demos  Download  Other Links

Research Projects:

1. Active Vision System Calibration Project
  • YG1: IIS binocular head
  • YG2: HelpMate BiSight
2. Articulated Object Recognition
3. Image-Based 3D Modeling and Augmented Reality
4. Image-Based Rendering
5. Motion Estimation and Visual Tracking
6. MPEG-4 Coding
7. Security Systems
  • Face Recognition
  • Face Detection
  • Signature Verification System
8. Shape Similarity
9. Sparse Representations for Image Decompositions
10. Wavelet-Based Image Analysis
  • Edge Detection
  • Wavelet-Based Shape from Shading
  • Wavelet-Based Image Registration


1. Active Vision System Calibration Project

(1) YG1: IIS binocular head

We built this binocular head to investigate active vision problems such as 3D model/environment reconstruction and 3D tracking. It has two prismatic joints (an X-Y table) and six revolute joints (pan, tilt, left verge, left focus, right verge, and right focus). We developed a four-stage calibration process (motorized-lens camera calibration, kinematics calibration, head/eye calibration, and global kinematic refinement) for this binocular head, which achieves a prediction error of one pixel and an epipolar error of 0.2 pixel.
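The epipolar error quoted above is commonly measured as the distance, in pixels, between a point in one image and the epipolar line induced by its match in the other image. The following is a minimal sketch of that measurement, not the lab's calibration code. The function names are hypothetical, and the fundamental matrix shown is the textbook matrix of an ideal rectified stereo pair, used here only as an assumed example.

```python
import math

def epipolar_distance(F, left_pt, right_pt):
    """Distance in pixels from right_pt to the epipolar line F * left_pt."""
    x = (left_pt[0], left_pt[1], 1.0)
    # Epipolar line l = F x, stored as coefficients (a, b, c) of a*u + b*v + c = 0.
    a, b, c = (sum(F[r][k] * x[k] for k in range(3)) for r in range(3))
    u, v = right_pt
    return abs(a * u + b * v + c) / math.hypot(a, b)

def mean_epipolar_error(F, matches):
    """Average point-to-epipolar-line distance over (left_pt, right_pt) pairs."""
    return sum(epipolar_distance(F, l, r) for l, r in matches) / len(matches)

# Assumed example: fundamental matrix of an ideal rectified pair,
# in which corresponding points lie on the same image row.
F_RECTIFIED = [[0.0, 0.0, 0.0],
               [0.0, 0.0, -1.0],
               [0.0, 1.0, 0.0]]
```

With this matrix the distance reduces to the row difference between the two matched points, so a pair of matches that are each 0.2 rows apart gives a mean error of 0.2 pixel.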



(2) YG2: HelpMate BiSight

Our YG2 binocular vision system consists of a HelpMate BiSight platform and two tailor-made motorized lenses. It uses 10 servo motors to drive the pan and tilt axes and the left/right verge, focus, zoom, and aperture axes. We are now calibrating this binocular head.

 
2. Articulated Object Recognition

I. Background:

We investigate the problems of recognizing and finding articulated and deformable objects. In particular, we study human arm and leg articulations, restricting the class of objects to those that can be represented as contours. Our target applications are video editing, database content retrieval, animation, and medical imaging. In the real world, articulated shapes are encountered almost everywhere. However, most (invariant-based) object recognition systems cannot recognize articulated or deformable targets. Methods based on multiple views are interesting but cannot account for large deformations or articulations. They lack a way to capture the similarity between an object and its articulated or deformed counterparts.

II. Our Approach:

We propose a deconstruction framework to recognize and find articulated objects. The deconstruction view of recognition naturally decomposes the problem of finding an object in an image into the subproblems of (1) extracting key features in an image, (2) detecting key points in the models, (3) segmenting an image, and (4) comparing shapes. None of these subproblems can be resolved independently. Together, they reconstruct the object in the image.

III. Demo:

As an illustration, please refer to the demo.
 

             
3. Image-Based 3D Modeling and Augmented Reality

This research focuses on reconstructing 3D object models from images and on compositing virtual objects with real images and videos. Using computer vision techniques, we investigate how to build photo-realistic 3D object or environment models that can be used in virtual reality or computer graphics. We have extensive experience in camera calibration, calibration of an active binocular head, stereo vision, 3D acquisition using an active binocular head, range finding using structured lighting, and 3D registration. By developing new camera calibration, 3D modeling, and model-based tracking techniques, we also investigate how to integrate virtual graphical objects into real images or videos for augmented reality. An additional goal of this project is to develop new interactive tools that make related 3D tasks in multimedia applications (such as those in MPEG-4) easier.

Previous results include:

  • Hierarchical approach for DTM generation (paper download)
  • Laser scanning and active color-stereo techniques for range data acquisition
  • Registration and integration of range images for 3D object modeling (paper download: [1] [2])
  • Reconstruction of 3D environment models with an active binocular head
  • Augmented reality via model-based camera pose re-estimation and tracking (paper download: [postscript] [word])

Current research emphases:

  • Augmented reality via camera self-calibration
  • Model-based tracking algorithm for augmented reality
  • Interactive laser scanning system
     

 

4. Image-Based Rendering

Image-based rendering techniques make it easy to construct photo-realistic virtual scenes. For example, a panoramic imaging system provides a simple way for users to look around the surrounding scene from a fixed viewpoint, and an object movie system lets users observe an object of interest from almost any direction. However, traditional image-based rendering techniques cannot easily provide rich interaction for users navigating within a virtual world, such as smooth transitions between viewpoints within a panorama system, nor can they easily provide stereo views.

In our laboratory, we have developed a technique called Disparity Morphing for image-based rendering systems. We have successfully applied it to several image-based rendering systems, such as the Stereo Panoramic Imaging SYstem (SPISY) and the Object Movie System, to enhance their realism and interaction capability.

Projects under the image-based rendering research topic:

 

5. Motion Estimation and Visual Tracking

In our lab, we combine fast algorithms, such as the adaptive early-jump-out method and globally optimal template matching, with a hierarchical structure for motion estimation. Optical flow, estimated using an SRT model and a robust two-stage approach, is used for ego-motion estimation.
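To illustrate the general idea behind early jump-out in template matching, here is a minimal sketch of block matching with a sum of absolute differences (SAD) that abandons a candidate as soon as its partial sum can no longer beat the best match found so far. This is a generic fixed-rule version written for illustration. It is not the lab's adaptive method, and the function names and the exhaustive search window are assumptions.

```python
def sad_with_early_exit(block, frame, top, left, best_so_far):
    """SAD between block and the frame patch at (top, left).

    Returns None as soon as the partial sum reaches best_so_far.
    """
    total = 0
    for r, row in enumerate(block):
        for c, value in enumerate(row):
            total += abs(value - frame[top + r][left + c])
        if total >= best_so_far:  # early jump-out: this candidate cannot win
            return None
    return total

def match_block(block, frame, center, radius):
    """Search a (2*radius+1)^2 window around center=(row, col).

    Returns (best_offset, best_sad), where best_offset is (d_row, d_col).
    """
    bh, bw = len(block), len(block[0])
    best, best_offset = float("inf"), None
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            top, left = center[0] + dr, center[1] + dc
            # Skip candidates that fall partly outside the frame.
            if top < 0 or left < 0 or top + bh > len(frame) or left + bw > len(frame[0]):
                continue
            sad = sad_with_early_exit(block, frame, top, left, best)
            if sad is not None:
                best, best_offset = sad, (dr, dc)
    return best_offset, best
```

Because the partial sum only grows, rejecting a candidate early never changes the result of the search; it only skips arithmetic. A hierarchical scheme would run the same search on coarser copies of the frame first and refine the offset at finer levels.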

We developed a 3D feature-based tracker for tracking multiple moving objects with the IIS head. We also developed a free-hand pointer based on hand tracking and a human head tracker based on automatic detection and tracking of human heads. Recently, we developed a model-based tracker based on a robust estimation technique and applied it in the augmented reality project.
 


6. MPEG-4 Coding

MPEG-4 is a standard for multimedia applications. It provides interfaces for integrating and interacting with many types of information, such as video, audio, graphics objects, panoramas, 2D and 3D mesh object animation, face and body animation, and SNHC.

MPEG-4 also opens up many research topics in audiovisual processing and communication. The MPEG-4 related topics of most interest to our laboratory include: how to code video data efficiently; how to segment video objects; how to identify background objects during video conferencing for sprite coding; how to generate animation meshes for foreground objects during video conferencing; how to generate 3D objects from stereo images; and how to generate face object information in combination with a head-tracking system.

Current research projects under the MPEG-4 Coding research topic include:

  • MPEG-4 Codec Construction
  • MPEG-4 Video Object Segmentation
  • MPEG-4 Sprite Coding (Panorama & Conferencing)
  • MPEG-4 3D Object Coding & Augmented Reality
  • MPEG-4 Face Object Coding

 

7. Security Systems

(1) Face Recognition

In this project, a coarse-to-fine, LDA-based face recognition system is proposed. Through careful implementation, we found that the databases adopted by two state-of-the-art face recognition systems were flawed because they mistakenly used some non-face portions of the images for face recognition. Hence, a face-only database is used in the proposed system. Since facial features differ only slightly from person to person, determining decision boundaries is harder in this system than in the conventional approaches. Therefore, to avoid this ambiguity, we propose retrieving the closest subset of database samples instead of a single sample. The proposed face recognition system has several advantages. First, the system is able to deal with a very large database and can thus provide a basis for efficient search. Second, by design, the system can handle defocus and noise problems. Third, the system is faster than the autocorrelation plus LDA approach and the PCA plus LDA approach, which are believed to be two statistics-based, state-of-the-art face recognition systems.
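The idea of returning a closest subset rather than a single best match can be sketched as a k-nearest-neighbor lookup. The sketch below assumes that the query and the database entries have already been projected into a low-dimensional feature space (for example by LDA); the function name, the (label, features) layout, and the use of Euclidean distance are illustrative assumptions, not details taken from the system described above.

```python
import heapq
import math

def closest_subset(query, database, k):
    """Return the k entries nearest to query as (distance, label) pairs.

    database is a list of (label, feature_vector) tuples whose vectors
    have the same length as query.
    """
    scored = ((math.dist(query, features), label) for label, features in database)
    return heapq.nsmallest(k, scored)
```

A later, finer stage can then compare the query only against the returned candidates, which is one way a coarse-to-fine search keeps the cost manageable on a large database.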

(2) Face Detection

A geometrical face model and an efficient facial feature detection approach are proposed. Based on the fact that human faces share the same geometrical configuration, the proposed approach can accurately detect facial features, especially the eyes, even when the images have complex backgrounds. The average computation time for one image of size 512x340 is less than 5 seconds on a SUN-Sparc 20 workstation. Experimental results demonstrate that the proposed approach can efficiently detect human facial features and satisfactorily deal with the problems caused by bad lighting conditions, skewed face orientation, and even facial expressions.

(3) Signature Verification System

In this project, a wavelet-based off-line signature verification system is proposed. The proposed system can automatically identify useful and common features that consistently exist within different signatures of the same person and, based on these features, verify whether a signature is a forgery or not. The system starts with a closed-contour tracing algorithm. The curvature data of the traced closed contour are decomposed into multiresolution signals using wavelet transforms. Then the zero-crossings corresponding to the curvature data are extracted as features for matching. Moreover, a statistical measurement is devised to decide systematically which closed contours of a writer, and their associated frequency data, are most stable and discriminating. Based on these data, the optimal threshold value that controls the accuracy of the feature extraction process is calculated. The proposed approach can be applied to both on-line and off-line signature verification systems. Experimental results show that the average success rates for English signatures and Chinese signatures are 91.71% and 93%, respectively.
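As a rough illustration of the feature extraction step, the sketch below decomposes a curvature sequence into detail signals at several resolutions and records where each detail signal changes sign. The Haar wavelet is used only because it is the simplest to write down; the text does not say which wavelet the system uses, and the function names are hypothetical.

```python
def haar_levels(signal, levels):
    """Return the Haar detail coefficients of each level, finest first."""
    details, approx = [], list(signal)
    for _ in range(levels):
        if len(approx) < 2:
            break
        # Pair up neighbors; a trailing unpaired sample is dropped.
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / 2 for a, b in pairs])
        approx = [(a + b) / 2 for a, b in pairs]
    return details

def zero_crossings(values):
    """Indices i where the sign changes between values[i] and values[i + 1]."""
    return [i for i in range(len(values) - 1) if values[i] * values[i + 1] < 0]

def crossing_features(curvature, levels):
    """Zero-crossing positions of the detail signal at each level."""
    return [zero_crossings(d) for d in haar_levels(curvature, levels)]
```

Matching two signatures would then amount to comparing these per-level crossing positions; how the comparison is scored and thresholded is not covered by this sketch.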

 

8. Shape Similarity

I. Background:

Representing shapes is a significant problem for vision systems that must recognize or classify objects. Methods that compare two shape contours by evaluating global deformations tend to be sensitive to occlusion and fail to account for local deformations (such as articulations), since these deformations may change the global appearance of objects considerably even though the entire deformation is concentrated at specific points. A class of methods compares objects by deforming one object into another and evaluating the amount of deformation applied in this process. Guaranteed methods typically use dynamic programming (time warping) to register two contours. These are all string (contour) matching algorithms. The main drawback of these approaches is that they do not account for region information or for symmetries.
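For readers unfamiliar with the dynamic programming (time warping) registration mentioned above, the following is a minimal sketch of classic dynamic time warping applied to two 1D contour descriptors, such as curvature sampled along each contour. It illustrates the prior approaches discussed in this background, not the method proposed in this project; the function name and the absolute-difference cost are assumptions.

```python
def dtw_cost(a, b):
    """Minimum cumulative |a[i] - b[j]| over monotone alignments of a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j - 1],  # advance both
                                    cost[i - 1][j],      # advance a only
                                    cost[i][j - 1])      # advance b only
    return cost[n][m]
```

The alignment works purely on the two sequences of boundary samples, which is why this family of methods has no access to region information or symmetries.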

II. Our Approach:

We derive a representation model for shapes. Given a shape, we investigate its self-similarities and construct its shape axis (SA) and shape axis tree (SA-tree).

We start with a shape, its boundary contour, and two different parameterizations of the contour. To measure its self-similarity, we consider matching pairs of points (and their tangents) along the boundary contour, i.e., matching the two parameterizations. The matching, or self-similarity, criteria may vary, e.g., co-circularity, parallelism, distance, or region homogeneity. The loci of the midpoints of the paired contour points form the shape axis, and they can be grouped into a unique tree graph, the SA-tree. We are now working on applications of the SA model.
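The last step of this construction, taking the midpoints of paired contour points, can be sketched in a few lines. The sketch assumes that the pairing has already been computed by matching the two parameterizations under some self-similarity criterion; that matching and the grouping into an SA-tree are not shown, and the function name and data layout are hypothetical.

```python
def shape_axis_points(contour, pairing):
    """Midpoints of the matched contour point pairs.

    contour is a list of (x, y) points; pairing is a list of (i, j)
    index pairs into contour.
    """
    return [((contour[i][0] + contour[j][0]) / 2,
             (contour[i][1] + contour[j][1]) / 2)
            for i, j in pairing]
```

For a contour that is mirror-symmetric about the line x = 0, pairing each point with its mirror image yields midpoints that all lie on that line, which matches the intuition that the shape axis follows the symmetries of the shape.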


III. Demo:

As an illustration, please refer to the demo.
 

9. Sparse Representations for Image Decompositions

I. Background:

In this project, the problem of sparse representations for image decompositions is addressed. Suppose we have an image I and a library of templates L, where L is an overcomplete basis for I. The templates can represent objects, faces, features, analytical functions, or even single pixels. There are infinitely many ways to decompose I as a linear combination of the library templates. Each decomposition defines a representation for the image I, given L. We are interested in the following question: what is an optimal representation for I given L, and how can it be selected?

II. Our Approach:

We formulate the image representation problem as an optimization process. In particular, our modeling requires:

(1) Sparse Representation: to represent (decompose) an image using as few templates as possible in order to have an economical (minimal) representation.

(2) Occlusions: to allow for partial occlusions, i.e., the cost of fitting a template must take into account that portions of the template may have a "bad match".

(3) Noise: to model noise via "noise templates" or canonical templates, accounting for the difference between the template fit and the image.
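To make the sparse representation requirement concrete, the sketch below uses matching pursuit, a well-known greedy algorithm for decomposing a signal over an overcomplete library. It is a stand-in chosen for illustration, not the optimization formulated in this project: it assumes unit-norm templates stored as flat lists, and it does not model occlusions or noise templates.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def matching_pursuit(image, library, max_terms, tol=1e-9):
    """Greedy sparse decomposition over unit-norm templates.

    Returns (terms, residual), where terms is a list of
    (template_index, coefficient) pairs in selection order.
    """
    residual = list(image)
    terms = []
    for _ in range(max_terms):
        if math.sqrt(dot(residual, residual)) <= tol:
            break
        # Pick the template most correlated with what is still unexplained.
        index = max(range(len(library)),
                    key=lambda k: abs(dot(residual, library[k])))
        coeff = dot(residual, library[index])
        residual = [r - coeff * t for r, t in zip(residual, library[index])]
        terms.append((index, coeff))
    return terms, residual
```

With the library {(1, 0), (0, 1), (1/√2, 1/√2)} and the image (3, 3), the two axis templates could represent the image with two terms, but the greedy choice finds the one-term representation that uses the diagonal template.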

III. Demo:

As an illustration, please refer to the demo.
 

10. Wavelet-Based Image Analysis

(1) Edge Detection

This project proposes a new wavelet-based approach to the edge detection problem. The proposed scheme adopts Canny's three criteria as a guide to derive a wavelet-style edge filter so that the edge points of an image can be detected directly and accurately at different scales. Since Canny's criteria are suitable for edge detectors that detect local extrema, the desired wavelet is chosen to be anti-symmetric. In order to obtain sufficient information for reconstructing and analyzing the original image, the dual of the desired wavelet is also required. The pair of wavelets is represented as a linear combination of translations of a scaling function. By introducing a constrained optimization process, the expansion coefficients of the desired wavelet and of its dual can be determined. In order to implement the desired edge detector, the continuous wavelet has to be converted into discrete form. For this purpose, the format of the discrete wavelet transform has to be developed. Since the proposed edge filter is wavelet-based, the inherent multiresolution nature of the wavelet transform provides more flexibility in the analysis of images. Also, since an optimization process is introduced in the filter derivation, the proposed filter performs better than the Mallat-Zhong edge detector.
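The principle of detecting edges as local extrema of the response of an anti-symmetric filter at several scales can be shown on a 1D signal. In the sketch below a Haar-like odd filter stands in for the optimized wavelet described above, whose coefficients are not given in this text; the function names and the thresholding rule are assumptions made for illustration.

```python
def antisymmetric_response(signal, scale):
    """Response of a Haar-like odd filter at every valid position.

    The response at i is (sum of the next `scale` samples) minus
    (sum of the previous `scale` samples).
    """
    return {i: sum(signal[i:i + scale]) - sum(signal[i - scale:i])
            for i in range(scale, len(signal) - scale + 1)}

def edge_points(signal, scale, threshold):
    """Positions where the response magnitude is a local maximum above threshold."""
    resp = antisymmetric_response(signal, scale)
    edges = []
    for i, value in resp.items():
        left = abs(resp.get(i - 1, 0.0))
        right = abs(resp.get(i + 1, 0.0))
        if abs(value) > threshold and abs(value) > left and abs(value) >= right:
            edges.append(i)
    return edges
```

Running the detector with a larger scale smooths out fine detail before differencing, so comparing the detected positions across scales gives a simple view of which edges persist at coarse resolution.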

 

(2) Wavelet-based Shape from Shading

This project proposes a wavelet-based approach to the shape from shading (SFS) problem. The proposed method takes advantage of the nature of wavelet theory, which can be applied to efficiently and accurately represent "thing", to develop a faster algorithm for reconstructing better surfaces. To derive the algorithm, the formulation of Horn and Brooks (Eds., Shape from Shading, MIT Press, Cambridge, MA, 1989), which combines several constraints into an objective function, is adopted. In order to improve the robustness of the algorithm, two new constraints are introduced into the objective function to strengthen the relation between an estimated surface and its counterpart in the original image. Thus, solving the SFS problem becomes a constrained optimization process.

 

(3) Wavelet-based Image Registration

In this project, we propose a new edge-based approach to the image registration problem. The proposed method applies the wavelet transform to extract feature points from a partially overlapping image pair. By defining a similarity metric, the two sets of feature points can be compared, and the correspondences between the feature points can be established. Once the set of correctly matched feature point pairs between the two images is found, the registration parameters can be derived accordingly. The proposed method can tolerate approximately 10% scaling variation and does not restrict the position or orientation of the input images.
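The final step, deriving registration parameters from matched feature point pairs, can be sketched as a least-squares fit. The sketch assumes that the two images are related by a 2D similarity transform (scale, rotation, and translation); the text does not state the parameter model, so this choice, the function name, and the complex-number encoding of points are assumptions.

```python
import cmath

def fit_similarity(src, dst):
    """Least-squares a, b such that dst ~ a * src + b.

    Points are complex numbers x + yj. Returns (scale, rotation, translation)
    with the rotation in radians and the translation as a complex number.
    """
    n = len(src)
    src_mean = sum(src) / n
    dst_mean = sum(dst) / n
    # Closed-form solution of the complex linear least-squares problem.
    num = sum((q - dst_mean) * (p - src_mean).conjugate()
              for p, q in zip(src, dst))
    den = sum(abs(p - src_mean) ** 2 for p in src)
    a = num / den
    b = dst_mean - a * src_mean
    return abs(a), cmath.phase(a), b
```

Writing each point as a complex number turns rotation and scaling into a single complex multiplication, which is why the fit has a closed form. At least two distinct matched pairs are needed, and mismatched pairs should be removed beforehand because an ordinary least-squares fit is sensitive to outliers.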
