Efficient Content-Based Image Retrieval

PI: Linda G. Shapiro
Department of Computer Science and Engineering
University of Washington

Contact Information

Linda G. Shapiro
Department of Computer Science and Engineering
University of Washington
PO Box 352350
Seattle, WA 98195-2350
Phone: (206) 543-2196
Fax: (206) 543-2969
Email: shapiro@cs.washington.edu

WWW PAGE

Project URL

List of Supported Students

Project Award Information

  • IRI-9711771
  • 09/15/1997 -- 08/31/2000
  • Efficient Content-Based Image Retrieval

    Keywords

    content-based image retrieval, image database, image indexing, image matching, distance measures

    Project Summary

    The focus of our work is the development of a general, scalable architecture to support fast querying of very large image databases with user-specified distance measures. We have developed algorithms and data structures for efficient image retrieval from large databases with multiple distance measures. We are investigating methods for merging our general, distance-measure-independent method with other useful techniques that may be distance measure specific, such as keyword retrieval and relational indexing. We are developing both new methods for combining distance measures and a framework in which users can specify their queries without detailed knowledge of the underlying metrics. We have built a prototype system to test our methods and evaluated it on both a large general image database and a smaller controlled database.
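
    As a rough illustration of what a user-specified combination of distance measures could look like, the following Python sketch builds one measure as a nonnegative weighted sum of simpler ones. The function names (l1, linf, weighted_sum) and the sample feature vectors are hypothetical and are not taken from the prototype system; a weighted sum is only one possible way to combine measures.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Distance = Callable[[Vector, Vector], float]


def l1(a: Vector, b: Vector) -> float:
    """City-block distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def linf(a: Vector, b: Vector) -> float:
    """Largest per-component difference between two feature vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


def weighted_sum(measures: List[Distance], weights: List[float]) -> Distance:
    """Return a new distance measure: sum of weights[i] * measures[i](a, b).

    Assumption for this sketch: weights must be nonnegative.
    """
    if len(measures) != len(weights):
        raise ValueError("need exactly one weight per measure")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be nonnegative")

    def combined(a: Vector, b: Vector) -> float:
        return sum(w * d(a, b) for d, w in zip(measures, weights))

    return combined


# Hypothetical query: weight the two measures equally.
combined = weighted_sum([l1, linf], [0.5, 0.5])
score = combined([0.0, 0.0], [3.0, 4.0])  # 0.5 * 7.0 + 0.5 * 4.0 = 5.5
```

    The nonnegativity check matters: a nonnegative weighted sum of measures that each satisfy the triangle inequality also satisfies it, so pruning methods that rely on that inequality can still be applied to the combined measure.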
     
     
     

    Publications and Products

    A. Berman and L. G. Shapiro. "Efficient image retrieval with multiple distance measures." Proceedings of the SPIE Conference on Storage and Retrieval for Image and Video Databases, February, 1997.

    A. P. Berman and L. G. Shapiro. "Selecting good keys for triangle-inequality-based pruning algorithms." IEEE International Workshop on Content-Based Access of Image and Video Databases, January 1998.

    A. P. Berman and L. G. Shapiro. "A Flexible Image Database System for Content-Based Retrieval." 17th International Conference on Pattern Recognition, August, 1998.

    A. P. Berman and L. G. Shapiro. "Triangle-Inequality-Based Pruning Algorithms with Triangle Tries." Proceedings of the SPIE Conference on Storage and Retrieval for Image and Video Databases, January, 1999.

    A. P. Berman and L. G. Shapiro, "A Flexible Image Database System for Content-Based Retrieval," Computer Vision and Image Understanding, Vol. 75, Nos. 1-2, 1999, pp. 175-195.

    A. P. Berman and L. G. Shapiro, "Efficient Content-Based Retrieval: Experimental Results," Proceedings of the IEEE Workshop on Content-Based Access of Image and Video Databases, June 1999, pp. 55-61.

    Demo Flexible Image Database System Using the Groundtruth Database

    Groundtruth Database
     

    Goals, Objectives, and Targeted Activities

    In the first two years of the project, we developed a prototype system, FIDS, with which the user can conduct searches using complex combinations of several dozen distance measures. We incorporated a new set of data structures and algorithms into the latest version of FIDS, which has led to a marked speedup of the elimination phase of the search in large image databases. Currently, FIDS contains over 37,000 images, and a complex search through the database can be performed in just a few seconds. Furthermore, we have created a ground-truth database that currently contains 11 datasets of about 48 images each, plus an ASCII file for each dataset that lists the names of the objects that appear in each image. Our web demo system uses this groundtruth database, to which both we and other researchers are adding data and associated descriptions. This database is meant for use in classification and object-recognition retrieval experiments.
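
    The elimination phase can be illustrated with the triangle-inequality pruning idea named in the publication titles above. The Python sketch below is a minimal illustration under stated assumptions, not the FIDS implementation: it uses a single city-block distance, a tiny in-memory database, and hypothetical names (build_index, lower_bound, search). Distances from every image to a few key images are stored ahead of time; at query time, the inequality d(q, i) >= |d(q, k) - d(i, k)| gives a cheap lower bound that rules out images without computing their full distance to the query.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def l1(a: Vector, b: Vector) -> float:
    """City-block distance; it satisfies the triangle inequality."""
    return sum(abs(x - y) for x, y in zip(a, b))


def build_index(database: Dict[str, Vector], keys: List[Vector]) -> Dict[str, List[float]]:
    """Precompute the distance from every database image to every key."""
    return {name: [l1(feat, key) for key in keys] for name, feat in database.items()}


def lower_bound(query_to_keys: List[float], image_to_keys: List[float]) -> float:
    """Best lower bound on d(query, image) over all keys."""
    return max(
        (abs(qk - ik) for qk, ik in zip(query_to_keys, image_to_keys)),
        default=0.0,
    )


def search(
    query: Vector,
    database: Dict[str, Vector],
    keys: List[Vector],
    index: Dict[str, List[float]],
    threshold: float,
) -> Tuple[List[Tuple[str, float]], int]:
    """Return (matches within threshold, number of full distances computed)."""
    query_to_keys = [l1(query, key) for key in keys]
    matches = []
    evaluated = 0
    for name, feat in database.items():
        if lower_bound(query_to_keys, index[name]) > threshold:
            continue  # eliminated: the true distance must exceed the threshold
        evaluated += 1
        d = l1(query, feat)
        if d <= threshold:
            matches.append((name, d))
    return sorted(matches, key=lambda m: m[1]), evaluated


# Hypothetical two-dimensional "feature vectors" standing in for images.
database = {"a": [0, 0], "b": [1, 0], "c": [10, 10], "d": [9, 9]}
keys = [[0, 0]]
index = build_index(database, keys)
matches, evaluated = search([0, 1], database, keys, index, threshold=2)
```

    In this toy run, images "c" and "d" are discarded using only the stored key distances, so only two of the four full distance computations are performed. The pruning never discards a true match, because the lower bound cannot exceed the actual distance.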

    Our goals for the current year are related to two new aspects of the work:

    Since manual keyword indexing is very tedious and relational indexing requires image analysis to find regions of interest, we are concentrating on automatic segmentation and object recognition techniques. We are developing techniques for recognizing such objects as vehicles, boats, office buildings, and houses. The goal is to produce a large set of very different features, ranging from standard color and texture features to regions produced by different kinds of segmentation and linear features such as line segments and circles. The features and the relationships among them will be combined into a large feature vector that will be the input to a new hierarchical learning technique, which was the subject of Yu-Yu Chou's 1999 dissertation at the University of Washington.

    Project Impact and Output

    The results we have obtained will be used to make image retrieval faster, easier, and more useful. The ideas should translate to use in complex multimedia information systems, making them useful to the scientific community as a whole, laymen with scientific interests, and students at all levels.

    This project has supported the Ph.D. research of Andrew Berman, who received his degree in March 1999, and has partially supported the work of Yu-Yu Chou, who received his Ph.D. in December 1999. Three undergraduate students, Eva Brezin, Kent Schliter, and Marsha Eng, participated in the project during the summers. Yi Li, a second-year CSE graduate student, is now supported by the project. Meanwhile, Andrew Berman has founded his own company, QueryPlus, in New Jersey and is working on commercial applications of the funded work. Yu-Yu Chou has just joined Numeritech in San Jose as their senior software engineer.

    Area Background

    The area of content-based image retrieval is a hybrid research area that requires knowledge of both computer vision and of database systems. Large image databases are being collected, and images from these collections made available to users in advertising, marketing, entertainment, and other areas where images can be used to enhance the product. These images are generally organized loosely by category, such as animals, natural scenes, people, and so on. Their access is dependent on a user being willing to browse large collections in order to select appropriate images.

    Researchers in computer vision and computer graphics have developed image distance measures that can compare a sample image or sketch provided by a user to the images in the database and retrieve those that are judged similar by the measure being used. Commercial systems like QBIC and Virage utilize measures that are based on low-level attributes of the image itself, including color histograms, color composition, and texture. State-of-the-art research focuses on more powerful measures that can find regions of an image corresponding to known objects that users wish to retrieve. There has been some success in finding human faces of different selected sizes, human bodies, horses, zebras and other textured animals with known patterns, and such backgrounds as jungles, water, and sky.
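
    To make the notion of a low-level distance measure concrete, the Python sketch below compares two images by the city-block distance between their normalized color histograms. It is a generic textbook-style example, not the measure used by QBIC, Virage, or our system; the function names, the choice of 4 bins per channel, and the sample pixel lists are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each in 0..255


def color_histogram(pixels: Iterable[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Normalized RGB histogram with bins_per_channel ** 3 bins."""
    hist = [0.0] * bins_per_channel ** 3
    count = 0
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return [h / count for h in hist]


def histogram_distance(h1: List[float], h2: List[float]) -> float:
    """City-block distance between two histograms: 0 (same) up to 2 (disjoint)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


# Hypothetical four-pixel "images".
red = [(255, 0, 0)] * 4
blue = [(0, 0, 255)] * 4
mixed = [(255, 0, 0)] * 2 + [(0, 0, 255)] * 2

d_same = histogram_distance(color_histogram(red), color_histogram(red))     # 0.0
d_far = histogram_distance(color_histogram(red), color_histogram(blue))     # 2.0
d_half = histogram_distance(color_histogram(red), color_histogram(mixed))   # 1.0
```

    A measure of this kind ignores where colors appear in the image, which is one reason the more powerful region-based and object-based measures discussed here are of interest.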

    Standard database systems, whether they be relational or object-oriented, depend heavily on the ability to index the data according to keywords or key phrases that are stored in the data. While images can be retrieved in this way, it requires human classifiers to look at each image and select a suitable set of keywords. Even if this could be done for millions of images, it would be insufficient, as the keywords would only be one person's ideas of the concepts in that image. Instead, both keywords and a large and powerful set of image distance measures are needed.

    Area References

    J. Barros, J. French, W. Martin, P. Kelley, and M. Cannon, "Using the triangle inequality to reduce the number of comparisons required for similarity-based retrieval," IS&T/SPIE Storage and Retrieval for Still Image and Video Databases, Volume IV, (1996).

    D. A. Forsyth, J. Malik, M. M. Fleck, H. Greenspan, T. Leung, S. Belongie, C. Carson, and C. Bregler, "Finding pictures of objects in large collections of images," Proceedings of the 2nd International Workshop on Object Representation in Computer Vision, (1996).

    T. Kato, T. Kurita, N. Otsu, K. Hirata, "A sketch retrieval method for a full color image database," 11th International Conference on Pattern Recognition, pp 530-533, (1992).

    A. Del Bimbo, M. Campanai, P. Nesi, "3D visual query language for image databases," Journal of Visual Languages and Computing, Vol 3, (1992).

    A. Gupta, "Visual information retrieval: a Virage perspective," white paper available on the World Wide Web, http://www.virage.com/literature/wpaper.html, (1995).

    M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele, P. Yanker, "Query by image and video content: the QBIC system," Computer, pp 23-32, Vol 28, number 9, (1995).

    A. Pentland, R. W. Picard, S. Sclaroff, "Photobook: tools for content-based manipulation of image databases," Technical Report, Volume 255, MIT, Media Lab., (1993).

    R. K. Srihari, "Automatic indexing and content-based retrieval of captioned images," IEEE Computer, Volume 28, number 9, pp 49-56, (1995).