Providing a Coherent View of Diverse Distributed Information Resources

Gregory H. Leazer


Dept. of Information Studies
Graduate School of Education and Information Studies
University of California, Los Angeles


Contact Information:


226 GSE&IS Building,
Mailbox 951520
Los Angeles, CA 90095-1520
Phone: (310) 206-8135
Fax: (310) 206-4460
Email: gleazer@ucla.edu

 

WWW PAGE


URL:
http://skipper.gseis.ucla.edu/faculty/gleazer/HTML/index.html

Project Award Information

List of Supported Students


Lisa Smith, Bill Childers, Frank Hoppe.

Keywords


Digital libraries, information retrieval, intertextual networks, textual identity, metadata, document representation.

Project Summary


New digital information resources necessitate the creation of new tools for resource discovery. Users must negotiate a large number of collections of information, such as new digital libraries and the collections of traditional libraries and museums. The distribution of resources into separate systems prevents optimal searching, and users must search multiple systems sequentially to perform exhaustive searches. A solution to this problem is to provide integrated control of multiple collections. The focus of this project is to investigate, using qualitative field work methods, how users select diverse information resources from distributed collections. A prototype system that integrates various types of metadata and forms networks of linkages is used to determine how users select resources. The network of linkages will reflect the evolving textual identity of individual resources by forming networks of intertextual associations; an intertextual network includes, for example, a progenitor work and its derivations, such as successive editions and translations. The educational component of this project is to develop a broad interdisciplinary program that integrates various theories in information retrieval.

Publications and Products:


Leazer, G. H., and Smiraglia, R. P. (1999). "Bibliographic Families in the Library Catalog: A Qualitative Study." Library Resources & Technical Services 43: 191-212.

Leazer, G. H. and J. Furner. (1999). "Topological Indices of Textual Identity Networks." Knowledge: Creation, Organization and Use: Proceedings of the 62nd American Society for Information Science Annual Meeting 36: 345-58.

Smiraglia, R. P. and Leazer, G. H. (1999). "Derivative Bibliographic Relationships: The Work Relationship in a Global Bibliographic Database." Journal of the American Society for Information Science 50: 493-504.

Leazer, G. H. (Scheduled). "Applying the Concept of the Work to New Environments." To appear in The Future of Cataloging: The Lubetzky Symposium, scheduled for publication by the American Library Association in April, 2000.

Project Impact:

Goals, Objectives, and Targeted Activities


The objective of this project is to develop an experimental system that integrates diverse collections of resources, drawn primarily from libraries, archives and museums. In particular, representational metadata from diverse systems will be joined to form intertextual networks such as networks of citations, hypertexts and shared features. I am especially interested in forming textual identity networks: clusters of individual documents that are derived from a common progenitor, such as a text with its later editions and translations, or various music scores and their performances. Over the past year I have been forming textual identity networks and developing metrics for characterizing their structures. I believe these metrics will provide useful evidence for information retrieval systems that return a ranked output of documents matching a given query, an approach applicable in several domains, including web retrieval.
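As a rough illustration of what a textual identity network and a structural metric over it might look like, the following minimal sketch represents derivations as a directed graph and computes three simple summaries: network size, maximum derivation depth, and the number of direct derivations per document. The edge list, the function names, and the choice of metrics are assumptions made for illustration; they are not the project's data, and they are not the topological indices developed in the papers listed under "Publications and Products."

```python
from collections import deque

# Hypothetical derivation edges: (source document, derived document).
# These titles and relationships are illustrative, not project data.
DERIVATIONS = [
    ("first edition", "second edition"),
    ("first edition", "French translation"),
    ("second edition", "third edition"),
    ("second edition", "abridged edition"),
]


def build_network(edges):
    """Map each document to the documents derived directly from it."""
    network = {}
    for source, derived in edges:
        network.setdefault(source, []).append(derived)
        network.setdefault(derived, [])
    return network


def generations(network, progenitor):
    """Breadth-first distance of each reachable document from the progenitor."""
    depth = {progenitor: 0}
    queue = deque([progenitor])
    while queue:
        node = queue.popleft()
        for child in network[node]:
            if child not in depth:
                depth[child] = depth[node] + 1
                queue.append(child)
    return depth


def summarize(network, progenitor):
    """Collect a few simple (assumed) structural summaries of the network."""
    depth = generations(network, progenitor)
    return {
        "size": len(depth),
        "max_depth": max(depth.values()),
        "direct_derivations": {
            doc: len(children) for doc, children in network.items()
        },
    }


network = build_network(DERIVATIONS)
summary = summarize(network, "first edition")
print(summary)
```

In a retrieval setting, summaries of this kind could be attached to each document as additional evidence for ranking, for example by favoring the progenitor or the most heavily derived member of a cluster; how best to weight such evidence is left open here.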

Education outcomes are described under "Project Impact."

Project References


See the papers listed above under "Publications and Products."

Area Background


Predating the advent of computers, early information retrieval systems arranged standardized descriptions of works in an order that would express a work's evolving textual identity over time. Early union catalogs allowed users to locate works without prior knowledge of the existence of a particular library or its collection. Such a system allowed people to search multiple collections simultaneously, and assembled the various "versions" of a particular work, allowing users to select the most appropriate manifestation for their purposes. Although the goal was never fully achieved, these systems attempted to create a single coherent view over a growing, evolving mass of materials. Digital libraries today provide substantial promise of enhancing scholarly communication by providing desktop access to the full range of resources required for scientific work. We must surmount several problems before today's digital library collections can be integrated into a single virtual collection. Such problems include the growing diversity of metadata techniques for representing networked resources; the control of evolving textual identity over distributed collections (a problem related to issues in version control); and the evaluation of both information retrieval methods in digital libraries and the impact of digital libraries on long-term user benefits such as student learning.

 

Area References


Botafogo, R.A., Rivlin, E., and Shneiderman, B. 1992. "Structural Analysis of Hypertexts: Identifying Hierarchies and Useful Metrics." ACM Transactions on Information Systems 10: 142-80.

Chakrabarti, S., Dom, B. E., Gibson, D., Kleinberg, J., Kumar, R., Raghavan, P., Rajagopalan, S., and Tomkins, A. 1999. "Mining the Link Structure of the World Wide Web." IEEE Computer 32: 60-67.

Harary, Frank, R.Z. Norman and D. Cartwright. 1965. Structural Models: An Introduction to the Theory of Directed Graphs. New York: Wiley.

Levy, David M. 1994. "Naming the Namable: Names, Versions, and Document Identity in a Networked Environment." In Filling the Pipeline and Paying the Piper: Proceedings of the Fourth Assn. of Research Libraries Symposium, ed. by Ann Okerson. Washington, DC: Assn. of Research Libraries: 153-59.

Lynch, Clifford A. 1995. "Networked Information Resource Discovery: An Overview of Current Issues." IEEE Journal on Selected Areas in Communications 13: 1505-22.

Marchionini, Gary, and Gregory Crane. 1994. "Evaluating Hypermedia and Learning: Methods and Results from the Perseus Project." ACM Transactions on Information Systems 12: 5-34.

Related Projects


I am a co-principal investigator of the Alexandria Digital Earth Prototype (Adept) Project, which is funded by the Digital Libraries Initiative II program. I serve on the project's evaluation team, along with Christine Borgman, Anne Gilliland-Swetland (both of UCLA) and Rich Mayer (UCSB), using evaluative techniques similar to those used in this current NSF project. Furthermore, the Adept project is a component of the proposed Interlib project, a collaborative effort with several institutional members, including UC Berkeley, Stanford, the California Digital Library and the San Diego Supercomputer Center. I am also a project director with the National Center for Research on Evaluation, Standards, and Student Testing (CRESST), a national research center funded by the U.S. Department of Education, where I am involved in technology-mediated learning assessment programs. We are currently seeking re-authorization of CRESST based on several new proposals. Two proposed projects draw directly from the research developed in this current project, including:

Finally, I am part of a team that has proposed a project using large graphs to represent the primary concepts of college geography and to investigate the use of that graph in a digital library setting. This proposal has been submitted to the ITR program at the NSF.