The Kernel Discriminant-EM Algorithm

In supervised learning for vision applications, we face several problems. One is that samples are often too numerous to label (face recognition), too expensive to label (head pose recognition), or impossible to label (image retrieval), while unlabeled samples are easy to obtain in many cases. Another is that a complete set of good physical features (or models) is hard to extract even with domain knowledge, since image space is huge and rich; how to automatically select the best features, switch among features, or combine them remains an open problem. Our proposed approach, the Discriminant-EM algorithm (D-EM), includes an unlabeled training set in supervised learning. Through a linear transformation, the original feature space is mapped to a new feature space in which the ratio of between-class scatter to within-class scatter is maximized by multiple discriminant analysis. A parametric generative model then models the distribution in the mapped feature space, whose dimension has been greatly reduced. The proposed D-EM algorithm estimates not only the parameters of the generative model but also the linear transformation, whose role is to construct a new set of features that is "best" for classification. Under this mapping, samples become clustered in the new feature space, so the assumption of a one-to-one component-class correspondence holds in many cases and D-EM converges to a good estimate. A further advantage of D-EM is that automatic feature selection is achieved. D-EM can be applied to many problems; we have employed it for image retrieval and hand posture recognition, with promising results.
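The discriminant step above relies on multiple discriminant analysis (MDA): find a linear transform W that maximizes the ratio of between-class scatter to within-class scatter. The following is a minimal NumPy sketch of that computation, not the authors' implementation; the function name, the small regularization term, and the eigendecomposition route are illustrative choices.

```python
import numpy as np

def mda_transform(X, y, n_components):
    """Multiple discriminant analysis: return a (d x n_components)
    matrix W maximizing between-class / within-class scatter ratio."""
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all).reshape(-1, 1)
        Sb += len(Xc) * (diff @ diff.T)
    # Generalized eigenproblem Sb w = lambda Sw w;
    # a tiny ridge keeps Sw invertible (illustrative choice)
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(Sw + 1e-6 * np.eye(d), Sb))
    order = np.argsort(-eigvals.real)
    return eigvecs[:, order[:n_components]].real
```

In D-EM this transform would be re-estimated inside each EM iteration, using the current soft labels of the unlabeled samples when accumulating the scatter matrices.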

D-EM is limited by its dependence on linear transformations. This work extends linear D-EM to a nonlinear kernel algorithm, Kernel D-EM, based on kernel multiple discriminant analysis (KMDA). KMDA is better able to simplify the probabilistic structure of data distributions in a discrimination space. We also propose two novel data-sampling schemes for efficient training of kernel discriminants. Experimental results show that classifiers trained with KMDA match SVM performance on standard benchmark tests, and that Kernel D-EM outperforms a variety of supervised and semi-supervised learning algorithms on a hand-gesture recognition task and a fingertip tracking task.


Publications:
  1. Qi Tian, Ying Wu, Jerry Yu and Thomas S. Huang, "Self-Supervised Learning Based on Discriminative Nonlinear Features and Its Applications for Pattern Classification", Pattern Recognition, Vol. 38, No. 6, 2005.   [PDF]

  2. Ying Wu and Thomas S. Huang, "Towards Self-Exploring Discriminating Features for Visual Learning", Journal of Engineering Application on Artificial Intelligence, Vol. 15, pp. 139-150, 2002.   [PDF]

  3. Qi Tian, Jerry Yu, Ying Wu and Thomas S. Huang, "Learning Based on Kernel Discriminant-EM Algorithm for Image Classification", in Proc. IEEE Int'l Conf. on Acoustics, Speech, and Signal Processing (ICASSP'04), Montreal, Canada, May 2004.   [PDF]

  4. Ying Wu, Kentaro Toyama and Thomas S. Huang, "Self-Supervised Learning for Object Recognition Based on Kernel Discriminant-EM Algorithm", in Proc. IEEE Int'l Conf. on Computer Vision (ICCV'01), Vol. I, pp. 275-280, Vancouver, Canada, July 2001.   [PDF]

  5. Ying Wu, Qi Tian and Thomas S. Huang, "Discriminant-EM Algorithm with Application to Image Retrieval", in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR'2000), Hilton Head Island, SC, June 2000.   [PDF]

  6. Ying Wu and Thomas S. Huang, "View-independent Recognition of Hand Postures", in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR'2000), Hilton Head Island, SC, June 2000.   [PDF]

  7. Ying Wu and Thomas S. Huang, "Self-Supervised Learning for Visual Tracking and Recognition of Human Hand", in Proc. AAAI 17th National Conf. on Artificial Intelligence (AAAI'2000), Austin, TX, July 2000.   [PDF]
Updated 09/2005. Copyright © 2001-2006 Ying Wu