ACRL News Issue (B) of College & Research Libraries
206 / C&RL News ■ February 2001

SCHOLARLY COMMUNICATION
Ivy Anderson, Gail McMillan & Ann Schaffner, editors

CrossRef
The missing link
by Ed Pentz

Citing articles in references is one of the foundations of the scholarly communication system. With references, authors make explicit links between their research and other articles that may, on the surface, appear unrelated. Eugene Garfield, the founder of ISI, tells us that through references, “authors should formally assert and verify their ideas are original and do not replicate discoveries already reported in the archive.”¹

Links enable users to see the body of primary literature as an interconnected collection of articles. When we can move from a reference to the full text of a cited article in one or two clicks, we will truly realize the benefits of electronic publication. With the advent of electronic journals, a lot of attention has been given to multimedia capabilities and features like online peer review, but a full system of reference links is an essential feature that has been missing to date.

Peter Boyce of the American Astronomical Society (AAS), reporting on the pioneering online journal Astrophysical Journal Letters, wrote in 1997, “Reader feedback continues to emphasize the importance of links by which it is possible to retrieve referenced articles…”²

Because reference linking is so important, publishers of scholarly journals have an economic imperative to provide reference links: journals without links will be seen as less valuable or useful than those with links. Many online journals have some links and have had them for a number of years. However, most of this linking has been within a very narrow, focused subject area, between large secondary database publishers and large primary publishers, or within proprietary journal systems.
For example, the astronomy literature is very well linked through the Astrophysics Data System (ADS),³ and HighWire Press⁴ has extensive reference links between HighWire journals and PubMed.⁵ What has been missing is direct links between primary publishers and links between secondary and smaller publishers. Reference linking has been held back by the need for bilateral linking agreements between individual publishers; drafting such agreements is a laborious and time-consuming legal process. To have abundant links, a publisher would have to sign agreements with hundreds of organizations, an unworkable proposition. It is especially difficult for smaller publishers without extensive staffs to participate in reference linking.

Editors' introduction

As academic librarians, we are all too familiar with the challenges that researchers face in following ideas through a long chain of citations. From article to shelf to microfilm to InterLibrary Loan and back again, the process of tracking down articles that form the pieces of an intellectual puzzle has often been a difficult one. One faculty member I know used to refer to the process as a “lifetime of bibliographic frustration.”

In contrast, one of the joys of information on the Web is the ability to move quickly through a series of related sites. With the advent of electronic journals, many of us expected to see journal citations transformed into instant links from full-text article to full-text article, regardless of publisher. This transformation has been slow to arrive, for a variety of technical, economic, and legal reasons. Our column for this issue describes an important effort to make electronic linking a reality: CrossRef.

Ivy Anderson, Gail McMillan, and Ann Schaffner

About the author

Ed Pentz is executive director of CrossRef, Publishers International Linking Association, in Burlington, Massachusetts, e-mail: epentz@crossref.org

CrossRef and PILA

Aware of the importance of linking and of the inefficiency of signing bilateral linking agreements, publishers took the unusual step of cooperating to set up CrossRef, a collaborative reference linking service. At the end of 1999, a group of leading scientific, technical, and medical (STM) publishers joined to form the nonprofit, independent organization Publishers International Linking Association, Inc. (PILA), which operates CrossRef from a central location in Burlington, Massachusetts.

The PILA Board of Directors includes representatives from AAAS (Science), Academic Press (Harcourt), American Institute of Physics, Association for Computing Machinery, Blackwell Science, Elsevier Science, Institute of Electrical and Electronics Engineers, Kluwer, Nature, Oxford University Press, Springer-Verlag, and John Wiley & Sons.

Even though CrossRef was incorporated in January 2000, the CrossRef system is already up and running. There are well over 60 publishers participating in CrossRef, accounting for nearly 3,000 journals with about 2.1 million article records. More than 60% of CrossRef members are nonprofit publishers, and, while STM was an initial focus, CrossRef now covers all areas of scholarly publishing.

CrossRef functions as a sort of digital switchboard. It holds no full-text content, but creates linkages through Digital Object Identifier (DOI) numbers, which are tagged to article metadata supplied by the participating publishers. A researcher clicking on a link will be connected to a page on the publisher’s Web site showing a full bibliographical citation of the article, and, in most cases, the abstract.
The format of the link is determined by publisher preference; for example, a CrossRef button or “Article” in HTML. The reader can then access the full-text article through the appropriate mechanism; subscribers will generally go straight to the text, while others will receive information on access via subscription, document delivery, or pay-per-view.

It is important to note that CrossRef acts “behind the scenes” and collects only a minimal amount of bibliographic metadata. Abstracts and full-text articles remain at publishers’ sites, and access to the material is controlled by publishers’ access control systems. This has been referred to as “distributed aggregation.” Users who are subscribers to the cited journal will in most cases have their Internet Protocol (IP) address checked and be able to access the full-text content seamlessly. CrossRef is not a search system. End users do not access CrossRef directly; organizations access CrossRef to look up DOIs to create full-text links to scholarly journal articles.

About the editors

Ivy Anderson is coordinator for Digital Acquisitions at Harvard University, e-mail: ivy_anderson@harvard.edu; Gail McMillan is director of Digital Library and Archives at Virginia Polytechnic Institute and State University, e-mail: gailmac@vt.edu; Ann Schaffner has been an academic librarian for more than 20 years and is currently a full-time MBA student at Simmons College, e-mail: ann.schaffner@simmons.edu

Creating and linking to DOIs

For participating publishers, CrossRef offers three main services: the depositing of article metadata in the CrossRef database, the submission of the references in those articles for the purpose of obtaining their DOIs, and the creation of links using those DOIs.

[Figure 1: “Workflow for DOI Reference Linking”]
The first step is for the publisher to obtain a DOI prefix from the International DOI Foundation (IDF) at http://www.doi.org. The cost of this service is covered by CrossRef membership. The publisher then submits minimal article metadata (journal title, article title, volume, issue, page, and first author) to the metadata database (MDDB) with the DOI and URL. Metadata must be submitted as XML conforming to a Document Type Definition (DTD); the specifications are provided on the CrossRef Web site (http://www.crossref.org). As part of the submission process, CrossRef registers the article’s DOI and URL in the central DOI Directory, run by the IDF.

The publisher then submits the reference citations contained in each journal article to the Reference Resolver (RR), a front-end component of the MDDB. The RR allows the retrieval of DOIs, enabling the publisher to create links. The format and protocol for these submissions are also covered on the CrossRef Web site. The publisher uses the DOI to create a normal DOI link (see Figure 1). When a reader clicks the link, the DOI is sent to the DOI Directory and automatically resolved to the URL deposited by the publisher (see Figure 2).

An example of a DOI is 10.1006/jmbi.1995.2434; it is for an article from Academic Press’s Journal of Molecular Biology, available on the IDEAL system. “10.1006” is Academic Press’s prefix (each publisher has a unique prefix). After the “/” the publisher determines how to identify the article. In this case Academic Press uses a four-letter code for the journal, the year of acceptance, and a sequential article number. This DOI as a link would appear as: http://dx.doi.org/10.1006/jmbi.1995.2434. Clicking on the link will take the user to the abstract page on the IDEAL system. Some publishers are using SICIs (Serial Item and Contribution Identifiers)⁶ or PIIs (Publisher Item Identifiers)⁷ for their DOIs.

The DOI is a very powerful tool.
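The prefix/suffix structure of the example DOI, and the way a click is resolved through the DOI Directory, can be sketched in a few lines of Python. This is an illustration only, not CrossRef's or the IDF's software: the names `Doi`, `parse_doi`, `doi_link`, and `ToyDirectory` are hypothetical, the directory is a plain in-memory dictionary, and the `example.org` addresses are placeholders. Only the DOI `10.1006/jmbi.1995.2434` and the `http://dx.doi.org/` link form come from the text.

```python
from dataclasses import dataclass
from urllib.parse import quote

# Link form quoted in the article; everything else here is an assumption.
RESOLVER_BASE = "http://dx.doi.org/"


@dataclass(frozen=True)
class Doi:
    prefix: str  # unique to the publisher, e.g. "10.1006"
    suffix: str  # chosen by the publisher, e.g. "jmbi.1995.2434"

    def __str__(self) -> str:
        return f"{self.prefix}/{self.suffix}"


def parse_doi(text: str) -> Doi:
    """Split a DOI at the first "/" into publisher prefix and item suffix."""
    prefix, sep, suffix = text.strip().partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a DOI: {text!r}")
    return Doi(prefix, suffix)


def doi_link(doi: Doi) -> str:
    """Build the clickable link form of a DOI."""
    return RESOLVER_BASE + quote(str(doi), safe="/")


class ToyDirectory:
    """In-memory stand-in for the central DOI Directory (hypothetical)."""

    def __init__(self):
        self._urls = {}

    def register(self, doi: Doi, url: str) -> None:
        # Registering the same DOI again overwrites the stored URL.
        self._urls[str(doi)] = url

    def resolve(self, doi: Doi) -> str:
        # Raises KeyError for a DOI that was never registered.
        return self._urls[str(doi)]


if __name__ == "__main__":
    doi = parse_doi("10.1006/jmbi.1995.2434")
    directory = ToyDirectory()
    directory.register(doi, "http://example.org/old/abstract")  # placeholder
    directory.register(doi, "http://example.org/new/abstract")  # page moved
    print(doi_link(doi))           # the published link never changes
    print(directory.resolve(doi))  # the directory returns the current URL
```

The design point the sketch tries to capture is the indirection: citing publishers store only the DOI link, so a change of location is recorded once in the directory rather than in every citing article.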
Reference linking until now has depended largely on algorithmic links, which employ URLs. Since a URL is not a true identifier, but a pointer to a location on a particular machine, one can reach a “file not found” error message if the file has been moved. A more serious problem is that this approach, like bilateral agreements, is not scalable; every publisher has to know and track changes in the linking format of every other publisher, which becomes an overwhelming task as linkage proliferates.

By taking the standards-based DOI approach, in which a given DOI is always associated with a specific article, CrossRef has removed the need for participants to archive linking schemes. If a publisher changes its URL, only the central DOI Directory needs to be updated and each DOI will automatically resolve to its new URL.

The International DOI Foundation ensures interoperability among different user communities. Through close cooperation with IDF, CrossRef has launched the first large-scale, practical DOI application to address the sophisticated demands of readers of scientific and scholarly journals. CrossRef has also become the first official DOI Registration Agency, granting it the means to assign DOI prefixes to CrossRef members, and to register DOIs in the system.

[Figure 2: “DOI Resolution”]

As a collaborative venture, the success of CrossRef depends on the cooperation of its members. Publishers must be prepared to receive incoming links at the time of metadata submission. They are also expected to maintain the accuracy of their metadata, DOIs, and URLs, and to provide information on article access.

CrossRef membership is open to primary scholarly publishers. However, many other organizations can benefit from using CrossRef to look up DOIs to create links to full-text articles.
To fill this need, CrossRef has created an affiliates category. Affiliates are nonmembers, such as secondary database producers, subscription agents, and abstracting and indexing services, who can sign up to use the CrossRef system.

CrossRef costs the researcher nothing; its expenses are covered by annual membership fees and by charges to member publishers for depositing their metadata and retrieving DOIs. There are no charges for clicking on links. Affiliates pay an annual administrative fee and retrieval fees for looking up DOIs. Library affiliates can pay a flat fee of $500 for unlimited access to DOI lookup. Current fee schedules are posted on the CrossRef Web site. Fees are designed to cover costs based on use of the system (so small publishers pay lower fees than larger publishers do). CrossRef itself has no stake in publishers’ decisions regarding their charges for content access.

Inevitably, problems unique to the digital realm have arisen. Of most concern to libraries is what is known as the “appropriate copy” issue. Since a user at an institution may have access to a given article through more than one source, he or she must be able to discover which is the “appropriate” copy. For example, a library user should not pay for an article at the publisher’s Web site if it is also available through a library subscription to Ovid or EBSCO Online, or in the library’s print holdings. The question of how to provide “localized links” so that users can get to appropriate copies has been under discussion for several years.⁸

To move this process along, CrossRef cosponsored the “Workshop on Localization in Reference Linking” with NISO, DLF, CNRI, and IDF.⁹ At this meeting a general architecture for localized linking was outlined and a practical prototype of this type of linking is now being planned. The prototype will involve CrossRef, IDF, DLF, publishers, libraries, and others working together.
DOIs, metadata, and OpenURL are all important parts of the localized linking solution and they all work together. CrossRef is committed to working with libraries and others on solutions to these problems.

Another major issue is the crucial question of how digital content is to be archived. Here, too, CrossRef is seeking the answers we will all need in the years ahead. For example, CrossRef hopes to link to such archiving systems as JSTOR, which scans journal issues, in some cases going back to the 1800s. Assigning DOIs to these older articles means that they can be included in the linking network. When a user can click on a citation to an article from the 19th century and get to the full text online, scholarly communication will truly be transformed.

CrossRef provides the “missing link” in linking, making broad-based linking efficient and manageable for large and small publishers. CrossRef is available for other organizations to use and can benefit the entire scholarly communications process. Because CrossRef has taken the approach of using open standards, it will need to be interoperable with other linking systems. The DOIs and metadata that CrossRef uses lay the groundwork for more sophisticated linking in the future.

Notes

1. Eugene Garfield, “The Concept of Citation Indexing,” Current Contents, January 3, 1994, http://www.isinet.com/isi/hot/essays/l.html.
2. Peter Boyce, “Electronic Publishing: Experience Is Telling Us Something,” Serials Review 23, no. 1 (1997): 1-10.
3. NASA Astrophysics Data System, http://adswww.harvard.edu.
4. HighWire Press, http://www.highwire.org/.
5. PubMed, http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed.
6. ANSI/NISO Z39.56-1996, Serial Item and Contribution Identifier (SICI).
7. Lorrin Garson, “Publisher Item Identifier as a Means of Document Identification,” http://pubs.acs.org/journals/pubiden.html.
8. Priscilla Caplan and William Arms, “Reference Linking for Journal Articles,” D-Lib Magazine (July/August 1999), http://www.dlib.org/dlib/july99/caplan/07caplan.html.
9. NISO/DLF/CrossRef Workshop on Localization in Reference Linking, July 24, 2000, CNRI, Reston, Virginia, Meeting Report, http://www.niso.org/CNRI-mtg.html.

Retirements

Julita C. Awkard has retired, after 38 years of service, as university librarian at Florida Agricultural & Mechanical University.

Deaths

Perry D. Morrison, 81, retired science librarian at the University of Oregon (UO), died December 7. From 1949 to 1963, Morrison was head social science librarian at UO. He left there in 1963 to become head librarian and director of the library science program at Sacramento State College (now Sacramento State University). From 1965 to 1967 he was on the faculty of the School of Librarianship at the University of Washington in Seattle. He returned to UO as a faculty member in its new School of Librarianship, later serving as its dean from 1971 until 1973. With the suspension of the school, he became a science librarian until his retirement in 1982. Morrison edited the Pacific Northwest Library Association’s quarterly from 1967 to 1972. He also served as president of the Oregon Library Association from 1961 to 1962. Morrison was published in many library publications and was for many years a contributor to Collier’s Encyclopedia Year Book about Oregon libraries; he enjoyed reviewing professional library publications and wrote articles about local history.