Object and Concept Recognition for Content-Based Image Retrieval


Principal Investigator

Linda G. Shapiro
Department of Computer Science and Engineering
University of Washington
Box 352350
Seattle WA 98195-2350



Keywords

image retrieval
object recognition
image classification
hierarchical multiple classifiers

Project Summary

With the advent of powerful but inexpensive computers and storage devices and with the availability of the World Wide Web, image databases have moved from research to reality. Content-based image retrieval is not yet a commercial success because most real users searching for images want to specify the semantic class of the scene or the object(s) it should contain. To this end, the goal of this research is to develop the necessary methodology for automated recognition of generic object and concept classes in digital images. The work builds on existing object-recognition techniques in computer vision for low-level feature extraction and designs higher-level relationship and cluster features, plus a new unified recognition methodology, to handle the difficult problem of recognizing classes of objects rather than particular instances. Local feature representations and global summaries that can be used by general-purpose classifiers are being developed. A powerful new hierarchical multiple classifier methodology will provide the learning mechanism for automating the development of recognizers for additional objects and concepts. The result of this work will be a new generic object recognition paradigm that can be applied immediately to automated or semi-automated indexing of large image databases and that will be a step forward in object recognition.

Publications and Products

Y. Li and L. G. Shapiro, "Consistent Line Clusters for Building Recognition in CBIR," Proceedings of the International Conference on Pattern Recognition, August 2002.

Y. Li and L. G. Shapiro, "Object Recognition for Content-Based Image Retrieval," in Lecture Notes in Computer Science, Springer-Verlag, to appear, 2004.

Project Impact

The results of this project will have an impact on both image retrieval from large databases and object recognition in general. The project targets the recognition of classes of common objects that can appear in image databases of outdoor scenes. It will develop object class recognizers, a new learning formalism for automating the production of classifiers for new classes of objects, and new representations for the image features used to recognize these objects. Together, these will allow content-based retrieval to become an important method for accessing real, commercial image databases, which today rely only on index terms assigned by humans.

Goals, Objectives and Targeted Activities

In the first year of the grant, we developed feature extraction routines whose features can be used to recognize an initial set of common objects, chosen to represent the variety of objects that appear in outdoor scenes, both city and non-city. We designed generic object recognition algorithms for this initial object set. So far we have developed such algorithms for vehicles, boats, and buildings, and have designed new high-level image features, including symmetry features and cluster features. In the second year, we designed a unified representation for the image features called abstract regions. These are regions of the image that can come about from many different processes: color clustering, texture clustering, line-segment clustering, symmetry detection, and so on. All abstract regions will have a common set of features, while each category will have its own special features. Our current emphasis is on using abstract features along with learning methodologies to recognize common objects.
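
As an illustration only, the abstract-region idea can be sketched as a data structure in which every region carries the same common features and each category adds its own special features. The class names, field names, and the particular attributes chosen here (area, centroid, bounding box, mean color, orientation, segment count) are assumptions made for this sketch; they are not taken from the project's software.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AbstractRegion:
    """Attributes assumed (hypothetically) to be shared by every abstract region."""
    area: int                                 # number of pixels in the region
    centroid: Tuple[float, float]             # (row, col) center of mass
    bounding_box: Tuple[int, int, int, int]   # (min_row, min_col, max_row, max_col)

    def common_features(self) -> List[float]:
        """Feature list with the same layout for every category of region."""
        min_r, min_c, max_r, max_c = self.bounding_box
        height = max_r - min_r + 1
        width = max_c - min_c + 1
        return [float(self.area), self.centroid[0], self.centroid[1],
                float(height), float(width)]

    def special_features(self) -> List[float]:
        """Category-specific features; the base region has none."""
        return []

    def feature_vector(self) -> List[float]:
        """Common features followed by the category's special features."""
        return self.common_features() + self.special_features()


@dataclass
class ColorRegion(AbstractRegion):
    """Region produced by color clustering (hypothetical special feature)."""
    mean_rgb: Tuple[float, float, float] = (0.0, 0.0, 0.0)

    def special_features(self) -> List[float]:
        return list(self.mean_rgb)


@dataclass
class LineClusterRegion(AbstractRegion):
    """Region produced by line-segment clustering (hypothetical special features)."""
    orientation_deg: float = 0.0
    num_segments: int = 0

    def special_features(self) -> List[float]:
        return [self.orientation_deg, float(self.num_segments)]


# Toy regions with made-up values, for illustration only.
sky = ColorRegion(area=1200, centroid=(40.0, 160.0),
                  bounding_box=(0, 0, 79, 319), mean_rgb=(90.0, 140.0, 230.0))
facade = LineClusterRegion(area=800, centroid=(150.0, 100.0),
                           bounding_box=(100, 50, 199, 149),
                           orientation_deg=90.0, num_segments=12)
```

In this sketch, a general-purpose classifier could compare any two regions through `common_features()`, whose layout never changes, while a recognizer for a particular object class could also consult the special features of the categories it cares about.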

Area Background

Content-based image retrieval is a hybrid research area that requires knowledge of both computer vision and database systems. Large image databases are being collected, and images from these collections are made available to users in advertising, marketing, entertainment, and other areas where images can be used to enhance the product. These images are generally organized loosely by category, such as animals, natural scenes, people, and so on. All image indexing is done by human indexers, who list the important objects in an image and other terms by which users may wish to access it. This manual method is not suitable for today's very large image databases.

Content-based retrieval systems use measures based on low-level attributes of the image itself, including color histograms, color composition, and texture. State-of-the-art research focuses on more powerful measures that can find regions of an image corresponding to known objects that users wish to retrieve. There has been some success in finding human faces of selected sizes, human bodies, horses, zebras and other textured animals with known patterns, and such backgrounds as jungles, water, and sky. Our research will focus on a unified methodology for feature representation and object class recognition. This work will lead to automatic indexing capabilities in the future.
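
To make the low-level measures concrete, the following sketch compares images by their quantized color histograms using histogram intersection. It is a generic illustration, not the measure used by any particular system mentioned here; the function names, the bin count, and the toy pixel lists are all assumptions.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def color_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Normalized histogram over a uniformly quantized RGB cube."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each channel value 0..255 to a bin index 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
    total = float(len(pixels))
    return [count / total for count in hist]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity in [0, 1] for normalized histograms; 1 means identical."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


# Toy "images" given as flat pixel lists (made-up values).
sky_a = [(90, 140, 230)] * 3 + [(250, 250, 250)]      # mostly blue, some white
sky_b = [(95, 135, 225)] * 2 + [(250, 250, 250)] * 2  # half blue, half white
grass = [(40, 160, 50)] * 4                           # all green

sim_sky = histogram_intersection(color_histogram(sky_a), color_histogram(sky_b))
sim_grass = histogram_intersection(color_histogram(sky_a), color_histogram(grass))
```

A measure of this kind ranks the two sky images as more similar to each other than to the grass image, but it has no notion of which objects are present, which is the limitation that motivates object class recognition.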

Area References

A. Berman and L. G. Shapiro, "A Flexible Image Database System for Content-Based Retrieval," Computer Vision and Image Understanding, Vol. 75, Nos. 1-2, pp. 175-195, 1999.

D. A. Forsyth, J. Malik, M. M. Fleck, H. Greenspan, T. Leung, S. Belongie, C. Carson, and C. Bregler, "Finding pictures of objects in large collections of images," Proceedings of the 2nd International Workshop on Object Representation in Computer Vision, 1996.

M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele, and P. Yanker, "Query by image and video content: the QBIC system," Computer, Vol. 28, No. 9, pp. 23-32, 1995.

H. Murase and S. K. Nayar, "Visual Learning of Object Models from Appearance," International Journal of Computer Vision, 1992.

C. Papageorgiou and T. Poggio, "A Pattern Classification Approach to Dynamical Object Detection," International Conference on Computer Vision, pp. 1223-1228, 1999.

Y. Rui and T. S. Huang, "Optimizing Learning in Image Retrieval," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 236-243, 2000.

L. G. Shapiro and G. C. Stockman, Computer Vision, Prentice Hall, Upper Saddle River, NJ, 2001.

A. Vailaya, M. Figueiredo, A. Jain, and H. J. Zhang, "Content-Based Hierarchical Classification of Vacation Images," Proceedings of IEEE International Conference on Multimedia Computing and Systems, 1999.

Project Websites

This is the main website for our project.


This page shows demos of various feature extraction and object recognition algorithms. Our older efficient content-based retrieval system is also available to try.

Online Data

Groundtruth Database: http://www.cs.washington.edu/research/imagedatabase/groundtruth/
Our groundtruth database consists of 21 datasets of outdoor scene images, many including a text file containing a list of visible objects for each image.