Abstract
A new method of tracking the position of important facial features for semantic-based moving image coding is presented. Reliable and fast tracking of the facial features in head-and-shoulders scenes is of paramount importance for reconstruction of the speaker's motion in videophone systems. The proposed method is based on eigenvalue decomposition of the sub-images extracted from successive frames of the video sequence. The motion of each facial feature (the left eye, the right eye, the nose and the lips) is tracked separately, so the algorithm can be easily adapted for a parallel machine. No restrictions, other than the presence of the speaker's face, were imposed on the actual contents of the scene. The algorithm was tested on numerous widely used head-and-shoulders video sequences containing moderate head pan, rotation and zoom, with remarkably good results. Tracking was maintained even when the facial features were occluded. The algorithm can also be used in other semantic-based systems.
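
As a rough illustration of the general idea only (the abstract does not give the algorithm's details), the following Python sketch tracks a single feature by building an eigenspace from sub-images taken from earlier frames and then choosing, in the next frame, the nearby window with the smallest reconstruction error. The function names (`eigen_basis`, `residual`, `track`), the search radius, and the minimum-residual matching criterion are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def eigen_basis(patches, k):
    """Return the mean and top-k eigenvectors of a stack of equal-sized patches.

    Assumption: the eigenspace is obtained from an SVD of the centred patches,
    whose right singular vectors are the eigenvectors of the sample covariance.
    """
    X = np.stack([p.ravel() for p in patches]).astype(float)  # shape (n, d)
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]


def residual(patch, mean, basis):
    """Norm of the part of the patch that the eigenspace cannot reconstruct."""
    v = patch.ravel().astype(float) - mean
    coeffs = basis @ v
    return float(np.linalg.norm(v - basis.T @ coeffs))


def track(frame, prev_pos, size, mean, basis, radius):
    """Search a (2*radius+1)^2 neighbourhood of prev_pos for the best window.

    Positions are (row, col) of the window's top-left corner. The matching
    rule (minimum residual) is a hypothetical choice for this sketch.
    """
    h, w = size
    r0, c0 = prev_pos
    best_pos, best_err = prev_pos, np.inf
    for r in range(max(0, r0 - radius), min(frame.shape[0] - h, r0 + radius) + 1):
        for c in range(max(0, c0 - radius), min(frame.shape[1] - w, c0 + radius) + 1):
            err = residual(frame[r:r + h, c:c + w], mean, basis)
            if err < best_err:
                best_pos, best_err = (r, c), err
    return best_pos
```

Because each feature would get its own basis and its own call to `track`, the four features (left eye, right eye, nose, lips) could be processed independently, which is consistent with the abstract's remark about parallel machines.
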
Original language | English |
---|---|
Pages (from-to) | 257-263 |
Number of pages | 7 |
Journal | IEE Proceedings on Vision, Image and Signal Processing |
Volume | 145 |
Issue number | 4 |
Publication status | Published - 1998 |