ACRL News Issue (B) of College & Research Libraries C&RL News ■ February 2002 / 117

Casting a wide net
The Early English Books Project meets at Northwestern
by Jeffrey Garrett

The history of the computer is widely known to have begun with Pascal’s adding machine of 1642 and Thomas Hobbes’s reflections on “computation” in 1656. Less known is that the poets of the 16th and 17th centuries imagined what some day would become the Internet and the Web. In a poem of 1611, John Donne anticipated the way an abstract network could be “thrown upon the heavens” and, like the nets used by fishermen, bring the universe to us:

    For of Meridians, and Parallels,
    Man hath weav’d out a net, and this net throwne
    Upon the Heavens, and now they are his owne.
    Loth to goe up the hill, or labour thus
    To goe to heaven, we make heaven come to us.1

Hyperlinks, image maps, the use of icons as memory devices representing larger meanings, even interactivity between medium and reader (16th-century almanacs performed astronomical calculations for the user and contained blank pages on which readers could add their own observation data): all of these “tools” were anticipated, and many fully developed, in early modern Europe.2

The 21st century is in the process of repaying the debt it owes to the philosophers, mathematicians, and dreamers of the 16th and 17th centuries.

Early English Books Online
When finished, Early English Books Online (EEBO) will bring almost every work published in England between 1475 and 1700, 125,000 in all, to the computer monitor in your home or office, not only as browsable reproductions of printed pages, to be read like a book on the screen, but also as electronic text, in which every occurrence of a requested word or phrase can be located and collated with every other occurrence of that same word or phrase elsewhere in a vast electronic library of Renaissance and Restoration England.
EEBO traces its origins to an enormous microfilming project begun in 1938, which reproduced on film the entire corpus of early printed books in the British Museum Library. In the 1990s, the creators and owners of this microfilm, University Microfilms (UMI) of Ann Arbor, Michigan, began to transfer this vast archive from film to bits, opening up the prospect of manipulation by computer and access through digital networks. Northwestern University Library and many other U.S. libraries acquired EEBO in 1999.3

About the author
Jeffrey Garrett is acting assistant university librarian for collection management at Northwestern University, e-mail: jgarrett@northwestern.edu

EEBO’s Text Creation Partnership
Reading centuries-old texts online is all well and good, especially if the alternative is a trip to a microfilm reader at a university library on a cold winter’s night, but what if we want to know how often Shakespeare’s name (or Cromwell’s or King Charles’s) was invoked by writers of the 17th century, and in what context? For this, we need electronically searchable text. A page image, regardless of how accurately it reproduces the original, does not relieve a human reader of the need to scan every page to locate these occurrences. For millions of pages this could take a lifetime and, in the history of scholarship, it often has.

This is where EEBO currently stops, but where another recently established initiative, the EEBO Text Creation Partnership (EEBO-TCP), has received a sweeping mandate. As of January 2002, 53 universities in the United States, Great Britain, and elsewhere in the world (Northwestern among them) were contributing members of the TCP, working under the leadership of the University of Michigan, Oxford University, the Council on Library and Information Resources (CLIR), and ProQuest Information and Learning (formerly University Microfilms).

[Figure: The reading wheel, from Agostino Ramelli’s Le diverse et artificiose machine (Paris, 1588). Reproduced by permission of the Harry Ransom Humanities Research Center, University of Texas at Austin.4]

The initial goal of the TCP is to create searchable text versions of 25,000 EEBO titles over a five-year period, concentrating on the first editions listed in a standard compilation, the New Cambridge Bibliography of English Literature (NCBEL). This process involves much more than just creating a digital copy of the text of these works. It also involves what is called “tagging,” which means labeling parts of a text as “author” or “title,” as part of a poem, as a foreign word, or as occurring on a particular page, in a particular chapter, or in a publication of a particular year. The power of tagging is that it allows readers to limit a search for words to particular parts of a text or to works published during a given range of years. Tagging thus allows for searches of enormous precision or “granularity.”

EEBO-TCP Summer Camp
Last July, representatives of six TCP member institutions located in the Midwest met at Northwestern University Library for a two-day summer camp of lab time and intensive consultations to help create the all-important interface that will stand between users of the texts and (literally) millions of pages of electronic text. Imagine this interface as a kind of dashboard on your computer, with adjustable controls a driver will need to navigate through this new cyberspace of Renaissance e-text. It is, of course, very important that the drivers get the controls that they will need to get safely and quickly to their destination.
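To make the tagging idea described earlier more concrete, here is a minimal Python sketch of a search that is limited to one kind of tagged passage or to a range of publication years. Everything in it is an illustrative assumption: the `Segment` record, the tag names, and the `search` function are hypothetical and do not represent the TCP's actual encoding or interface. The sample data uses only the Donne line and the two titles mentioned in this article.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Segment:
    """One tagged stretch of text (hypothetical record, for illustration only)."""
    work: str   # title of the work the passage belongs to
    year: int   # year of publication
    tag: str    # what the passage is, e.g. "title" or "poem"
    text: str   # the transcribed words


# Tiny sample built from works named in the article; not real TCP data.
SAMPLE = [
    Segment("Anatomy of the World", 1611, "title", "Anatomy of the World"),
    Segment("Anatomy of the World", 1611, "poem",
            "Man hath weav'd out a net, and this net throwne"),
    Segment("Midsummer Night's Dream", 1600, "title", "Midsummer Night's Dream"),
]


def search(segments: Iterable[Segment], word: str,
           tag: Optional[str] = None,
           years: Optional[Tuple[int, int]] = None) -> List[Segment]:
    """Return segments containing `word`, optionally limited by tag and year range."""
    needle = word.lower()
    hits = []
    for seg in segments:
        if tag is not None and seg.tag != tag:
            continue  # wrong part of the text
        if years is not None and not (years[0] <= seg.year <= years[1]):
            continue  # published outside the requested range (inclusive)
        # Compare whole words, ignoring case and punctuation.
        if needle in re.findall(r"[a-z']+", seg.text.lower()):
            hits.append(seg)
    return hits


if __name__ == "__main__":
    # Look for "net" only inside passages tagged as poetry.
    for hit in search(SAMPLE, "net", tag="poem"):
        print(hit.work, hit.year, hit.text)
    # Look for "dream" only in works published from 1601 to 1700.
    print(search(SAMPLE, "dream", years=(1601, 1700)))
```

The point of the sketch is that each filter narrows the search before any words are compared, which is what gives a tagged text its “granularity”: the same query can run over a whole library, over titles only, or over a single decade.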
Among the participants of EEBO-TCP Summer Camp were students and faculty members of the English and history departments at Indiana University, Notre Dame, University of Michigan, Michigan State, Wisconsin, and Northwestern. Librarians, especially subject specialists in English and history, were paired with their respective faculty, and computing specialists (some also with Ph.D.s in humanities disciplines) noted carefully what the future users of the full-text database said they would need.

Problems and wishes
It is clear that early modern books pose particular problems for digital projects such as this one. How, for example, do you search for occurrences of a particular word if that word can be spelled many different ways? The word “green,” for example, could be spelled “grene,” but “grene” was also an alternative spelling for our modern word “grain.”

Pagination is often lacking. In that case, what navigational aids can be given to the electronic reader?

Scholars warned that modern genre categories, even as general as “fiction” and “nonfiction,” cannot be safely applied retrospectively to early modern books. Histories of England, to give just one example, may be presented as richly allegorical epic poems. If not ours, then what genre categories should be used?

It also became clear that social scientists have different needs and expectations of the TCP than humanists do. A trivial example would be the different meaning that the word “act” has for humanists, for whom it is a part of a play, and for legal historians, for whom an “act” is, of course, a law promulgated by Parliament. Historians present pointed to the importance of capturing electronically the text of the 22,000 so-called “Thomason Tracts,” collected between 1640 and 1661 by London publisher and bookseller George Thomason, an extraordinarily valuable record of the turmoil in England during the Civil War.
Humanists emphasized, on the other hand, the need to tag typographical information in EEBO texts. This includes ornamental initials, signature information, catchwords, and other information, much of which was used by printers to properly assemble a book after the pages had been printed and cut. These details are also important for distinguishing different editions or even different printings of the same book.

Everyone present was concerned about how a sub-corpus of five, ten, or hundreds of texts might be identified based on content, year, author, or other search criteria, to allow further searches to be restricted to a specific group of works.

Staff from the main TCP office in Ann Arbor returned home with thick quires of notes and suggestions. Though sorting through it all will take a long time and careful analysis, the team charged with creating the interface got what it hoped for. They will now set to work creating a powerful “Renaissance computer,” such as the one the likes of John Donne and Thomas Hobbes dreamed of 400 years ago.

Notes
1. An image of the page in Donne’s Anatomy of the World (London, 1611) containing this poem is available online from EEBO at http://wwwlib.umi.com/eebo/image/5655/10.
2. A host of these innovations are detailed in a new book, The Renaissance Computer: Knowledge Technology in the First Age of Print (London: Routledge, 2000), edited by Neil Rhodes and Jonathan Sawday.
3. Examples of EEBO text are available at the EEBO “Featured Content” page at http://wwwlib.umi.com/eebo/featured. There you can download, for example, images of every page of the 1600 edition of Shakespeare’s Midsummer Night’s Dream.
4. “Hypertext” and “intertextuality” are new words, but the concepts behind them are hundreds of years old.
In this engraving from the late 16th century, a scholar follows “links” that lead him from one text to the next, using an elaborate wheel instead of today’s mouse click to navigate intertextual space. ■

(“Modeling . . .” continued from page 96)

students should learn better and retain the skills you want to teach.

Conclusion
Those of us who have been developing library instructional tutorials in different formats have been constantly trying to improve our products. I have argued here that the use of more images is required when creating Web tutorials if we are to take full advantage of the concept of modeling in social learning theory. Because a large part of student learning is visual, we need to use images that stimulate, motivate, and reinforce our tutorial content; using a familiar and appropriate metaphor is a useful way of doing this.

Notes
1. Albert Bandura, Social learning theory (Englewood Cliffs, N.J.: Prentice Hall; Toronto: Prentice-Hall of Canada, 1977).
2. Library Research Instruction, Faculty of Applied Health Sciences, Brock University Web page at http://spartan.ac.brocku.ca/~dsuarez/physeduc/index.html.
3. Library Research Instruction Department of Community Health Sciences Web page at http://spartan.ac.brocku.ca/~dsuarez/physeduc/chsgen_develop.htm.
4. Exercise Quiz, Department of Recreation and Leisure Web page at http://spartan.ac.brocku.ca/~dsuarez/physeduc/reclgeneric_quiz.htm. ■