CHAPTER EIGHT

BIBLIOGRAPHIC INFORMATION IN ELECTRONIC FORM

Whatever promise the new technologies hold, one may be certain that printed scholarly literature will continue to exist for a long time and that adequate bibliographic control is essential to scholarship. We might begin, therefore, with a fuller description of the ways in which the new technologies have been applied to the problem of access to global bibliographic information about the existing printed literature---in the first instance, information on the monographic or booklength literature and, in the second, information on the serial literature.[1]

ELECTRONIC ACCESS TO THE MONOGRAPHIC LITERATURE

Since the early 1970s, university libraries have contributed catalog records to databases maintained collaboratively. A critically important role in the collaboration has been played by two organizations, the OCLC (originally the Ohio College Library Center, now the Online Computer Library Center) and RLG (the Research Libraries Group).

Online Computer Library Center (OCLC)

OCLC[2] was founded in 1971, and its database, the Online Union Catalog, currently contains information on more than 24 million books and other materials held by more than 4,800 member libraries. The database is accessed by nearly 14,000 libraries in 46 countries for cataloging and reference purposes and in order to arrange interlibrary loans. It is growing by more than 2 million records annually; every seven days the Library of Congress adds an average of 4,200 machine-readable records. The database is extraordinarily useful not only because it permits uniformity in the content of catalog copy but also because it affords access to information about the existence of materials and serves as a record of the location of particular titles within the national system.

OCLC's database has traditionally been used principally by library professionals. In October 1991, however, the organization made available a service called FirstSearch, which permits individual patrons to access the database directly to search for materials. FirstSearch employs a menu that guides readers through a series of options; whereas the database was previously searchable only by author or title or a few other categories, the individual reader can now access the records by subject as well. Patrons pay a fixed fee for each search rather than by the minute. The system can be accessed either over the Internet, described in greater detail later, or in some instances over OCLC's new, high-speed, $70 million private telecommunications network, soon to be completed. OCLC has contracted with vendors such as H. W. Wilson Company to provide databases containing information on materials other than monographs.[3]

Research Libraries Group (RLG)

Of at least equal importance to research libraries of the type considered here are the achievements of the Research Libraries Group.[4] Founded in 1975, the RLG by 1991 had 112 members, among them universities, independent research libraries, archives, museums, and learned societies. In September 1991 its bibliographic database, the Research Libraries Information Network (RLIN), an online information system reflecting the combined holdings of the member institutions, contained 50 million catalog records for books, serials and their contents, musical scores, sound recordings, archival collections, maps, computer files, visual materials (films and photographs), and art sales catalogs. In 1992 RLG is adding a number of specialized indexes to RLIN that are now available only in print. Among the important databases already available are the Avery Index to Architectural Periodicals online, which analyzes articles from more than 700 publications; the Eighteenth Century Short Title Catalogue; and SCIPIO, the art sales catalog database, which provides citations for catalogs of sales dating from 1599 to the present, often valuable sources of information on the provenance of art objects, collection patterns, and so on.

RLIN is available to individual scholars, and readers need a personal computer and modem, a telephone line, and a searching account and password to access the records over the GTE Telenet communications network. Through one's local campus mainframe computer, the database is accessible also over the Internet. (There is currently no communications charge for this means of access.) The database is searchable by personal names, title words in any order, subject headings, and more than 40 additional categories, including the International Standard Book Number. Search results can be limited by language, date and place of publication, and holding library.
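The kind of search just described (title words in any order, with results limited by language or date) can be sketched as follows. This is only an illustration under stated assumptions: the `Record` fields, the `search` function, and the sample records are hypothetical and do not reflect RLIN's actual record structure or search software.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """A simplified catalog record; the fields are illustrative, not RLIN's schema."""
    author: str
    title: str
    language: str
    year: int
    holding_library: str


def title_words(text):
    """Lowercase a string and split it into a set of words, ignoring punctuation."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return set(cleaned.split())


def search(records, words, language=None, year_range=None):
    """Return records whose titles contain every query word, in any order.

    Optional limits narrow the result by language and by publication year.
    """
    wanted = title_words(words)
    hits = []
    for rec in records:
        if not wanted <= title_words(rec.title):
            continue
        if language is not None and rec.language != language:
            continue
        if year_range is not None and not (year_range[0] <= rec.year <= year_range[1]):
            continue
        hits.append(rec)
    return hits


# Invented sample records, for illustration only.
CATALOG = [
    Record("Smith, A.", "A History of Venetian Painting", "eng", 1985, "Library A"),
    Record("Rossi, B.", "Storia della pittura veneziana", "ita", 1979, "Library B"),
    Record("Jones, C.", "Venetian Glass: A History", "eng", 1968, "Library A"),
]

for rec in search(CATALOG, "history venetian", language="eng", year_range=(1980, 1992)):
    print(rec.author, "|", rec.title, "|", rec.holding_library)
```

Without the limits, the query matches both English-language titles regardless of word order; the language and date limits then narrow the result to the one record published in the stated period.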

The organization is currently engaged in efforts to make RLIN records available on local campus online library catalogs. Library patrons at a particular institution will be able to search for records in RLIN as they would in their own institution's online catalog.[5] To cite an early example, at New York University, where the first phase of a three-phase project has already been completed, since March 1990 a daily average of 350 bibliographic records has been transferred electronically from the RLIN system to the Geac system at New York University's Bobst Library. The next phase entails the transfer of records created or updated on NYU's Geac system to RLIN for incorporation into the database, and the final phase will permit online searching of the RLIN database locally by NYU's faculty and students. Other libraries are proceeding with similar plans. What is envisioned, ultimately, is a situation in which all RLG libraries are linked electronically.

Of particular importance is RLG's interest in improving the quality of bibliographic information on what might be called nontraditional materials. Increasingly, scholars, specifically in the humanities and related social sciences, are making use of images, the texts of musical compositions, unpublished archival sources, ephemera, and other such materials. Access to these sorts of materials is difficult because bibliographic information about them either is not available or is not organized in the same way as information on the published scholarly literature. As boundaries between existing humanistic disciplines are renegotiated and scholarly information needs change in response to this development and others, the information services designed to address those needs may change accordingly. In this respect RLG's interest in developing appropriate services is potentially of great importance.[6]

Other Initiatives

In addition to the catalog records maintained by OCLC and RLG, many research libraries have also made their own online catalogs available on the Internet. Information about the existence and location of materials not contained in the OCLC and RLG databases is thus provided. Moreover, catalog copy written locally may contain idiosyncratic bibliographic information, potentially of great interest to library professionals and scholars elsewhere. One of the difficulties in making information of this type available on the Internet, however, is that there is a great deal of such information. Two publications in particular, NYSERNet: New User's Guide to Useful and Unique Resources on the Internet and Internet Resource Guide, serve as invaluable guides to some of the more important resources.[7]

The much more significant problem is that the bibliographic record was not automated at most research libraries before the late 1970s. As a result there are hundreds of thousands (in some instances millions) of catalog records not contained in individual institutions' online catalogs. Libraries will have to undertake the retrospective conversion of their card catalogs to have a single integrated record of their monographic collections. This conversion can be done manually for small collections with efficiencies being achieved by searching the OCLC or RLG databases for records matching local holdings. But for major research libraries manual conversion will be so costly as to seem unfeasible. Ultimately, all research libraries will need to put their entire catalogs into machine-readable form. The cost of doing so will be high but may be appropriate in relation to the ongoing costs of library operations and catalog maintenance. More than half of ARL member libraries report that they have already converted 90 percent or more of their card catalogs to machine-readable form. One challenge, of course, is to prevent invaluable local cataloging information from being lost in the process.

Princeton's university librarian, Donald Koepp, and its vice president for computing and information technology, Ira Fuchs, have proposed to convert the university's printed catalog records in a different way. In the first of two phases, high-speed scanning technology would be used to produce digital, bit-mapped replicas of the cards;[8] the resulting images would be stored on optical platters. Although the images would be electronically searchable only in ways approximating the kind of manual searching one does in a card catalog, they would be available online; readers would thus be able to access the entire catalog electronically, although in a two-step process, and from any properly equipped remote station anywhere in the world, since the catalog would be available on the Internet. The second phase would entail converting the optical, bit-mapped records into MARC (machine-readable cataloging) format, in which author, title, and other such information are adequately distinguished. It would employ optical character recognition technology and automatic error-handling algorithms, rather than having the MARC tags assigned manually to each field in each record. The records could then be integrated with those in the online catalog.[9] Princeton's approach may prove to be a stopgap, as the costs of other technologies decline, but would be a step forward in any event.
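To make the second phase more concrete, the following is a minimal sketch of how recognized card text might be assigned MARC tags automatically, with doubtful cards set aside for human review. It is not Princeton's method, which the proposal does not describe at this level. The function `card_to_fields`, the assumed three-line card layout, and the sample cards are all hypothetical. Only the tag numbers (100 for a personal-name main entry, 245 for the title statement, 260 for publication information) are standard MARC.

```python
import re

# A plausible publication year, used as a crude check on the imprint line.
YEAR = re.compile(r"\b(1[5-9]\d\d|20\d\d)\b")


def card_to_fields(ocr_lines):
    """Map the lines of one OCR'd card to MARC-tagged fields.

    Assumes (hypothetically) that the card gives author, title, and imprint
    on its first three non-empty lines, in that order. Returns a tuple of
    (fields, problems); a non-empty problems list means the card should go
    to a person for review rather than be loaded automatically.
    """
    lines = [ln.strip() for ln in ocr_lines if ln.strip()]
    if len(lines) < 3:
        return {}, ["fewer than three lines recognized"]

    author, title, imprint = lines[0], lines[1], lines[2]
    problems = []
    # Personal names are conventionally entered as "Surname, Forename".
    if "," not in author:
        problems.append("author line lacks 'Surname, Forename' form")
    if not YEAR.search(imprint):
        problems.append("no plausible year in imprint line")

    fields = {"100": author, "245": title, "260": imprint}
    return fields, problems


# Invented OCR output: one clean card and one with recognition errors.
clean_card = ["Melville, Herman, 1819-1891.", "Moby-Dick; or, The whale.", "New York : Harper, 1851."]
noisy_card = ["Mclvillc Hcrman", "Moby-Dick", "New York Harper l85l"]

for card in (clean_card, noisy_card):
    fields, problems = card_to_fields(card)
    print(fields.get("245"), "->", "needs review" if problems else "ready to load", problems)
```

The design point is the division of labor: records that pass the checks are tagged without manual work, and only those that fail are routed to staff.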

ELECTRONIC ACCESS TO THE SERIAL LITERATURE

The automation of the bibliographic record of the monographic literature has been paralleled by similar services providing information about the serial literature. There is an important difference between the two kinds of service, however. Libraries themselves assumed responsibility for providing bibliographic information in electronic form about their monographic collections, as a continuation of the traditional cataloging activity. Information in electronic form about the serial literature, on the other hand, is in many instances provided by commercial services. The cost implication for libraries is significant: if they wish to offer a comprehensive array of bibliographic services, they must absorb the substantial cost of acquiring the commercial services, and in many instances members of the university community demand such services in addition to traditional acquisitions.

RILA, RILM, INFO-SOUTH

The array of information such bibliographic services can provide is illustrated by RILA (Repertoire International de la Litterature de l'Art) and by RILM (Repertoire International de la Litterature Musicale), RILA's prototype. RILA provides bibliographic information (and in many instances abstracts) for current publications in the history of Western art: monographs, book reviews, conference reports, exhibition catalogs, periodical articles, festschriften, and other publications. It is produced by the Getty Art History Information Program (AHIP) and has recently merged with the Repertoire d'Art et d'Archeologie, a parallel French bibliography produced by the Centre National de la Recherche Scientifique. More than half the records contain abstracts written by staff members of the AHIP whose responsibility it is to review the current literature, locate and identify publications worthy of being indexed and abstracted, and write brief synopses. As of January 1991 the database contained more than 130,000 records on items published from 1973 on. The bibliographic records and abstracts are available in printed and electronic form. The comparable publication in the history of music, RILM, is produced by the International Musicological Society and the International Association of Music Libraries. It shares many of its essential characteristics with RILA, with two exceptions: there is a five-year interval between publication of the literature and publication of the index, and many RILM abstracts are written by the authors themselves.

Both databases are available online through the DIALOG Information Retrieval Service, from Dialog Information Services, Inc., a Knight Ridder Company. Similar databases are available through WILSONLINE, from H. W. Wilson, and ORBIT Search Service, a division of Maxwell Online, Inc., which provides electronic versions of such scientific indexes as Chemical Abstracts.[10]

Another example is INFO-SOUTH, the Latin American Information System, which is a comprehensive database of abstracts of the contents of 1,600 publications on all aspects of society and change in South America, Central America, and the Caribbean. Included are newspapers, news magazines, and journals. The University of Miami manages INFO-SOUTH and permits subscription by either hourly rate or annual fee for unlimited use.[11]

It would be difficult to exaggerate the advantages to scholars of having such bibliographic information available in electronic form, in part because of the nature of the information itself, which extends to the level of the individual item (the individual article or book review), and in part because of the ability to search the literature completely for virtually all items of interest and, in contrast with manual searching, with considerable ease. As Michael L. Dertouzos has noted, such services "relieve many of the repetitive, boring and unpleasant tasks related to processing and communicating information."[12] However, one should not underestimate the cost of utilizing such services. DIALOG's promotional literature suggests that "[a] typical 10-minute search can cost from $6 to $16.50. (These examples include telecommunications costs but do not include offline print charges.)"[13] Accordingly, while some university library systems have chosen to make such online services available directly to the individual reader, others, understandably, have restricted their use to members of the library staff so as to keep searching costs to a minimum.
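A short calculation shows what the quoted figures imply once searching becomes routine. It is a rough sketch only: it assumes that cost scales linearly with connect time, which the DIALOG literature does not state, and the workload of 20 searches a week over a 30-week academic year is a hypothetical example, not a figure from any institution.

```python
# Figures quoted in DIALOG's promotional literature for a "typical 10-minute search".
QUOTED_MINUTES = 10
QUOTED_LOW = 6.00
QUOTED_HIGH = 16.50


def scaled_cost(quoted_cost, minutes):
    """Scale a quoted 10-minute cost to another duration.

    Assumes cost grows linearly with connect time, which is a simplification;
    the quoted figures also exclude offline print charges.
    """
    return round(quoted_cost / QUOTED_MINUTES * minutes, 2)


def term_range(searches_per_week, weeks):
    """Low and high totals for a period of typical 10-minute searches."""
    n = searches_per_week * weeks
    return round(QUOTED_LOW * n, 2), round(QUOTED_HIGH * n, 2)


print("per minute:", scaled_cost(QUOTED_LOW, 1), "to", scaled_cost(QUOTED_HIGH, 1))
print("per hour:  ", scaled_cost(QUOTED_LOW, 60), "to", scaled_cost(QUOTED_HIGH, 60))
# Hypothetical workload: 20 searches a week over a 30-week academic year.
print("per year:  ", term_range(20, 30))
```

On these assumptions the quoted range corresponds to $0.60 to $1.65 per minute, or $36 to $99 per connect hour, and the hypothetical workload of 600 searches would cost between $3,600 and $9,900 a year for a single heavy user.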

SERVICES OFFERING INDIVIDUAL ACCESS TO DATABASES

Many scholars have argued for individual access to the databases for the reason that "scholars need to be guided by their instincts when they search databases just as when they search card catalogues or browse the stacks," as one proponent of individual access has phrased it.[14] In response to the interest of individual scholars in having direct access to indexes of the type described here, some institutions have purchased computer tapes containing the bibliographic records and the requisite software from the vendors and have made the databases available on local-area networks, which saves the cost of the long-distance telecommunications connection. In such instances individual users may have a menu of options available to them listing various kinds of campus information services: the online library catalog, various bibliographic services, and so on. Individual patrons may then search whichever database is pertinent to their purposes. In other instances vendors have made portions of their complete databases available on CD-ROM, and libraries have made the discs available as they would traditional printed indexes. Although the discs share with other electronic media the advantage that one can easily search the database, vendors have tended to stipulate in the rental or sales agreement that they not be mounted on a local network.[15] In such cases they share with printed indexes the disadvantage of being available to only a single patron at a time, as contrasted with the online databases, accessible by more than one patron simultaneously.[16]

Both OCLC and RLG offer yet a third option; they have acquired some of the existing indexes directly from the vendors and have mounted them on their information systems. OCLC, for example, has contracted with vendors such as the H. W. Wilson Company to add to the existing databases already available on OCLC's system.[17] RLG, too, has added various indexes to those available on RLIN. For a fixed annual fee institutions are permitted unlimited searching of some of the files and thus enjoy the advantages of having such indexes accessible locally without having had to assume responsibility for the technical demands involved in mounting them.[18]

A further important issue is that many disciplines, in the humanities and related social sciences in particular, either do not have bibliographic services of the type described here or are dissatisfied with the ones they do have.[19] Here again, the Research Libraries Group has played an important role in working with learned societies to identify information needs and assess the adequacies (or inadequacies) of existing bibliographic services.[20]

An experiment conducted by Dialog Information Services at Earlham College in Indiana was designed to gauge faculty and student response to the availability of its services and to gather information about their use of the databases.[21] During the academic year 1990-91, Dialog provided Earlham with a year's free access to its bibliographic and full-text databases and absorbed the telecommunications charges; during the following academic year, it permitted unlimited searching at a discounted rate. The college has received a $200,000 gift from an alumnus to endow online searching. During the first year of the experiment, more than 90 percent of the faculty and 80 percent of the students accessed the databases at some time, although the percentages of those making extensive use of the services were probably lower. Many faculty members testified to the promise these services hold for scholarship and, notably, for teaching. As one faculty member observed:

In Notes on Virginia, Jefferson described the process[:] "A patient pursuit of facts, and cautious combination and comparison of them, is the drudgery to which man is subjected... if he wishes to attain sure knowledge." Jefferson is still right about the patient pursuit of facts.... We have, however, taken much of the drudgery out of the process and made it easier to find sources, but we still have to read carefully---probably more carefully than ever---and we still have to think. The difference is that searching no longer takes much time and energy from the scholarship of thought.[22]

The experience at Earlham gives some sense of the utility of these services and of the importance to scholarship of facilitating access to information about information. To be sure, there is a superabundance of information available, and in attempting to establish bibliographic control over the literature on a particular topic, scholars face formidable challenges resulting from that very superabundance. Moreover, as some faculty members at Earlham suggested, there is the risk that easy access to information will lead some students to substitute the exhaustive assembling of facts and others' opinions for their own critical evaluation and interpretation of issues.

Relatively complete access to global bibliographic information is a critically important objective. Scholarly arguments based on thorough knowledge of the professional literature are at minimum better informed and obviously to be preferred over those that are less firmly grounded. At the same time the cost to institutions of the services that provide access to such information should not be minimized. In an era of limited resources, difficult decisions will have to be made about possible tradeoffs in acquisitions between traditional printed materials, which will continue to be fundamental, and services like those described here. Indeed, one of our purposes is to highlight some of the tensions that now exist and will continue as the new information technologies are found to have ever more useful applications to scholarship. The argument is that providing scholars with readily accessible information about the existence and location of scholarly materials held elsewhere is in many respects a more important objective than building a free-standing, self-sufficient local collection.[23]

Endnotes

[1] By global information we mean information on scholarly literature beyond that contained in one's own local research library. For purposes of this discussion the term "monographic literature" applies not only to monographs but also to textbooks, editions of primary texts, and other such materials where the bibliographic record would ordinarily consist solely of information on the entire volume. That literature is to be contrasted with the serial literature and related kinds of writings (a collection of essays by several different authors, a conference report, a festschrift), where the most useful bibliographic information would extend to the level of the individual article within the collection.

[2] The information in the following two paragraphs was taken from two articles: "Bibliographic Data Base Marks 20th Anniversary," The Chronicle of Higher Education 38 (September 4, 1991):A26; and David L. Wilson, "Researchers Get Direct Access to Huge Data Base," The Chronicle of Higher Education 38 (October 9, 1991):A24-A25, A28.

[3] On the responses at one institution to FirstSearch, see Henry S. Whitlow, "Verdict Is In on FirstSearch at Bluefield College," SOLINEWS 18 (Spring 1992):9-10.

[4] Information on RLG is taken from David L. Wilson, "Research Libraries Group Seeks New Focus and New Members," The Chronicle of Higher Education 38 (January 22, 1992):A21-A22, and from two promotional pieces: RLG and Personal Access to RLIN: How Individuals Can Search an On-Line Catalog of Research Libraries' and Archives' Collections (The Research Libraries Group, Inc., 1989).

[5] See RLG; also, Jennifer Hartzell, "RLG and NYU Complete Phase One of Project for Electronic Record Exchanges Between RLIN and Geac Local Systems," Press Release, The Research Libraries Group, Inc. (September 21, 1990).

[6] On changing scholarly information needs and their implications for libraries, see in particular Scholars and Research Libraries in the 21st Century, ACLS Occasional Paper, no. 14 (New York: American Council of Learned Societies, 1990); and Lawrence Dowler, "Among Harvard's Libraries: Conference on Research Trends and Library Resources," Harvard Library Bulletin, n.s. 1 (Summer 1990):5-14. On RLG's initiatives in this area, see Wilson, "Research Libraries Group."

[7] NYSERNet: New User's Guide to Useful and Unique Resources on the Internet, A Project of the NYSERNet K-12 Networking Interest Group and the NYSERNet/NYS Library Networking Interest Group for Libraries, Version 2.0 (Syracuse, New York: NYSERNet, 1991) and Internet Resource Guide (Cambridge, Mass.: NSF Network Service Center, BBN Systems and Technologies Corporation, 1989). The second of these publications provides periodic updates as new resources become available on the Internet. Both publications list online library catalogs that are available; among them are the SUNY Buffalo Online Catalog, Colorado Association of Research Libraries, City University of New York Online Catalog, SUNY Binghamton Online Catalog, and the New York Public Library Online Catalog (listed in NYSERNet: New User's Guide), and Boston University, University of California and California State University, The University of Michigan's Online Catalog, Emory University Libraries Online, The Library Catalog for the University of Colorado at Colorado Springs, The Catalog of the University of Pennsylvania Libraries, The University of Wisconsin Madison and Milwaukee Campuses Network Library System, University of Utah Card Catalog System, Northwestern University LUIS Online Catalog, University of Maine System Library Catalog, University of Illinois at Chicago, Cleveland Public Library Catalog, Penn State University Library Information and Access System, Harvard Online Library Information System, Cataloging from the Library of Congress, The Online Catalog, Princeton University Libraries, The Cal Poly, San Luis Obispo, Kennedy Library's Online Catalog, and University of Iowa Libraries (listed in Internet Resource Guide).

[8] The technology is similar to that used for billing purposes by American Express to produce an image of the triplicate form a client signs at the time of a transaction.

[9] There would remain the considerable problem of converting catalog records in nonroman type. For a general discussion of the issue of retrospective catalog conversion, see the special issue on retrospective conversion of the IFLA Journal 16 no. 1 (1990).

[10] See, for example, the Directory of Online Databases Volume 12, Nos. 1 and 2, January 1991 (New York: Cuadra/Elsevier, 1991).

[11] Mick O'Leary, "INFO-SOUTH Fills Foreign Data Gap," Information Today 9 (June 1992):13-14.

[12] "Communications, Computers and Networks," Scientific American (September 1991):30-37, especially p. 37.

It is important to remember, however, that the automated record in most disciplines extends back only a few years. Here again, the scholarly community faces the enormous problem of retrospective conversion of the bibliographic record of the serial literature so that comprehensive searching is possible. Clearly, information needs differ between the sciences and the humanities in this respect; in most scientific disciplines it is nowhere near as important to be able to search last decade's literature as it is in the humanities (although historians of science will want to be able to do so). On this point, see Douglas Greenberg, vice president of the American Council of Learned Societies, "Technology, Scholarship, and Democracy, or, You Can't Always Get What You Want," a talk delivered at the fall 1991 meeting of the Coalition for Networked Information, Washington, D.C. A revised and condensed version of the talk was published under the same title in EDUCOM Review 27 (May-June 1992):46-51. Here we think it is important to add, once again, that abstracting and indexing services are tools appropriate to print literature. Electronic versions of full texts that are fully machine searchable may to some extent obviate the need for such services.

[13] See DIALOG Database Catalog 1991 (Palo Alto, Calif.: Dialog Information Services, Inc., 1991).

[14] Greenberg, "Technology, Scholarship, and Democracy," 11.

[15] In some instances individual institutions have sought permission from the publishers for networked access, but the publishers for the most part have prohibited such access and have instead limited it to a single person at a time. Such limitations by publishers are likely to change---indeed, are already changing---as concerns about revenue are shaped by different factors.

[16] Clearly, the situation with respect to the available kinds of services and options of this type is changing rapidly. For information about some of the choices institutions are currently making in response, we are grateful to Patricia Battin, president of the Commission on Preservation and Access, Marvin Bielawski, assistant university librarian for technical services at Princeton University, Paula Kaufman, dean of libraries at the University of Tennessee, Knoxville, Daniel Oberst, director of advanced technology and applications, Office of Computing and Information Technology at Princeton University, and David Penniman, president of the Council on Library Resources.

On the general situation as it is at present, see also Fran Spigai, "Information Pricing," Annual Review of Information Science and Technology 26 (1991):39-73. For this reference, we are indebted to David Penniman.

[17] Wilson, "Researchers Get Direct Access." The indexing and abstracting services already available through FirstSearch are listed on pp. A24-A25.

[18] On RLG's service, called CitaDel, see the promotional piece CitaDel: The Complete Citation and Document-Delivery Service from the Research Libraries Group (Mountain View, California: The Research Libraries Group, 1992). Among the abstracting services available are Dissertation Abstracts, Newspaper Abstracts, and Periodical Abstracts. In addition, RLG has mounted a number of indexes previously available only in print, including the Hispanic-American Periodicals Index, the Index to Foreign Legal Periodicals, and Technology and Culture's bibliography for the history of technology. RLG also offers a companion service, to be discussed in more detail later, that delivers the full texts of most articles cited in the CitaDel files.

One of the general problems institutions face in making information resources of this type available to patrons is the wide variety of user interfaces; there can be as many different kinds of protocols for access to such resources as there are vendors, media, and so on. This point will be discussed in greater detail in Chapter 11.

[19] In "Information Access: Our Elitist System Must Be Reformed," The Chronicle of Higher Education 38 (October 23, 1991):A48, Douglas Greenberg suggests that adequate bibliographies in electronic form are generally unavailable and argues for the development of such services. For a differing opinion on the question of the general availability of bibliographic services, see David Lewis, "Letters to the Editor: Is Access Equitable?" The Chronicle of Higher Education 38 (November 20, 1991):B4.

Richard P. Kollin and James E. Shea, in "New Trends in Information Delivery," Information Services and Use, 4 (1984):225-227 (especially p. 227), and Miriam A. Drake and Kathy G. Tomajko, in "The Journal, Scholarly Communication, and the Future," Serials Librarian 10 (1986):289-298 (especially p. 292), discussed the further problems of overlap (two or more services indexing an article) and "underlap" (no coverage of some journals by such services); one possible solution they suggest is "gatewaying," which permits complementary databases to be integrated with one another in a way that prevents duplication.

[20] In the general promotional piece RLG (p. 15), the charge of the Task Force on Scholarly Bibliographies is described as follows: "Despite the annual bibliographies produced by many of America's learned societies, access to periodical literature and informal publications is still inadequate for many fields and interdisciplinary areas. During 1992 RLG is working with the American Council of Learned Societies and some of its constituent societies, especially in history and area studies, to assess the inadequacies of bibliographic access, to define useful enhancements, and to establish a pilot project for cooperative production of an online, multipurpose bibliography."

See also RLG's two excellent publications Information Needs in the Humanities: An Assessment, prepared for the Program for Research Information Management of the Research Libraries Group, Inc., Principal Author: Constance C. Gould (Stanford, Calif.: The Research Libraries Group, Inc., 1988) and Information Needs in the Social Sciences: An Assessment, prepared for the Program for Research Information Management of the Research Libraries Group, Inc., Principal Authors: Constance C. Gould (Economics, Political Science, Psychology), Mark Handler (Sociology, Anthropology) (Stanford, Calif.: The Research Libraries Group, Inc., 1989). Both publications contain expert analyses of changing scholarly practices and information needs in various disciplines and what the RLG might do to help address them.

We are also grateful to RLG for sharing various informative memoranda and unpublished materials on their initiatives.

[21] On the experiment, see Amy Beth, "When Cost is No Factor: The Impact on Faculty of Unlimited Access to DIALOG," Information Searcher 4 (1991):3ff., and Jerry Woolpy, "The World in a Keystroke," Earlhamite 111 (Fall 1991):5-6.

[22] Woolpy, "World in a Keystroke," 5-6.

[23] See, for example, Paul Gherman, "Setting Budgets for Libraries in Electronic Era," The Chronicle of Higher Education 37 (August 14, 1991):A36.